Methods, compositions, and kits for nucleic acid analysis

ABSTRACT

Aspects of the disclosure relate to methods and kits for assessing cancer. Some aspects of the disclosure relate to methods and kits for preparing a sample library for sequencing. Some aspects of the disclosure relate to methods and kits for allele detection. Some aspects of the disclosure relate to high efficiency ligation methods and kits. Some aspects of the disclosure relate to sensitive detection of amplicons.

CROSS-REFERENCE

This application claims the benefit of U.S. Provisional Patent Application No. 62/219,656, filed Sep. 16, 2015, U.S. Provisional Patent Application No. 62/208,079, filed Aug. 21, 2015, and U.S. Provisional Patent Application No. 62/354,024, filed Jun. 23, 2016, which applications are herein incorporated by reference in their entireties.

SEQUENCE LISTING

The instant application contains a Sequence Listing which has been submitted electronically in ASCII format and is hereby incorporated by reference in its entirety. Said ASCII copy, created on Aug. 19, 2016, is named 44288-710.201_SL.txt and is 1,579,993 bytes in size.

BACKGROUND

Cancer poses serious challenges for modern medicine. It has been estimated that in 2007 cancer caused about 13% of all human deaths worldwide (7.9 million). Cancer can encompass a broad group of various diseases that can involve unregulated cell growth. In cancer, cells can divide and grow uncontrollably, can form malignant tumors, and can invade nearby parts of the body. Cancer can also spread to more distant parts of the body, for example, via the lymphatic system or bloodstream. There are over 200 different known cancers that afflict humans. Many cancers can be associated with mutations, for example, mutations in cancer-related genes. The mutational status of a cancer can vary from one individual subject to another, and even from one tumor cell to another tumor cell in the same subject. Knowledge of these mutations can aid in the selection of cancer therapy, and can also aid in informing disease prognosis and/or disease status. Provided herein are improved methods, compositions, and kits for detecting, monitoring, and diagnosing cancer.

SUMMARY

Aspects of the disclosure relate to methods and kits for assessing cancer. Some aspects of the disclosure relate to methods and kits for preparing a sample library for sequencing. Some aspects of the disclosure relate to methods and kits for allele detection. Some aspects of the disclosure relate to high efficiency ligation methods and kits. Some aspects of the disclosure relate to sensitive detection of amplicons.

In some instances, an aspect of the present disclosure provides a method for nucleic acid library formation, said method comprising: (a) ligating a single-stranded adaptor to a 5′ end of a single-stranded nucleic acid fragment, wherein said single-stranded adaptor is coupled to a solid support; (b) annealing a target-specific oligonucleotide probe to a target sequence in said nucleic acid fragment coupled to said solid support, wherein said target-specific oligonucleotide probe comprises a 3′ end that anneals to said target sequence and a 5′ end comprising a second adaptor sequence; (c) extending said annealed target-specific oligonucleotide probe, thereby generating an extension product; and (d) amplifying said extension product using a first primer comprising sequence of said single-stranded adaptor and a second primer comprising sequence of said second adaptor. The single-stranded adaptor can comprise an affinity tag or a reactive moiety. The affinity tag or reactive moiety can comprise biotinyl-TEG, aminohexyl, or acrydite. In some cases, the solid support comprises a paramagnetic material. In some cases, the solid support comprises a streptavidin polystyrene bead, a polyacrylamide bead, a tosyl-activated carboxylated bead, or an NHS-activated carboxylated bead. The method can further comprise purifying unligated single-stranded nucleic acid fragment from ligated single-stranded nucleic acid fragment between step a) and step b). The amplifying can comprise from about 1 to about 15 cycles of polymerase chain reaction (PCR). The method can further comprise coupling said single-stranded adaptor to said solid support before step a). The method can further comprise denaturing a double stranded nucleic acid to generate said single-stranded nucleic acid fragment of step a). The method can further comprise pre-adenylating said single-stranded nucleic acid before step a).

In some aspects, the disclosure provides for a method for nucleic acid library formation, said method comprising: (a) ligating a first single-stranded adaptor to a 5′ end of a single-stranded nucleic acid fragment; (b) ligating a second single-stranded adaptor to a 3′ end of said single-stranded nucleic acid fragment, thereby generating a single-stranded nucleic acid fragment comprising a 5′ first single-stranded adaptor and a 3′ second single-stranded adaptor following step a) and step b); (c) extending a primer annealed to the second single-stranded adaptor to generate an extension product; (d) performing polymerase chain reaction to amplify the extension product, thereby generating amplified extension product; and (e) sequencing said amplified extension product. The ligating of step a) can occur before said ligating of step b), wherein said ligating of step a) can occur in a reaction mixture that lacks said second single-stranded adaptor. The ligating of step b) can occur before said ligating of step a), wherein said ligating of step b) can occur in a reaction mixture that lacks said first single-stranded adaptor. The method can further comprise pre-adenylating said second single-stranded adaptor before step b). The method can further comprise phosphorylating a 5′ end of said single-stranded nucleic acid fragment before step a). The method can further comprise pre-adenylating said single-stranded nucleic acid fragment before step a). The method can further comprise performing a purification step to remove unligated first single-stranded adaptor after step a). The method can further comprise performing a purification step to remove unligated second single-stranded adaptor after step b).

In some aspects, the disclosure provides for a method of generating a nucleic acid library, said method comprising: (a) ligating a first single-stranded adaptor to a 3′ end of a single-stranded nucleic acid template to generate a single-stranded template ligated to said first single-stranded adaptor; (b) annealing a primer to said single-stranded adaptor ligated to said single-stranded nucleic acid template; (c) performing linear amplification using said primer to generate a linear amplification product comprising said primer and sequence complementary to said single-stranded nucleic acid template; and (d) ligating a second single-stranded adaptor to a 3′ end of said linear amplification product. The first single-stranded adaptor can be from about 19 bases to about 25 bases in length. The linear amplification can be performed under isothermal conditions. The linear amplification can be performed with Bst DNA polymerase. The linear amplification can be performed under cycling temperature conditions. The linear amplification can be performed with a thermostable polymerase. The method can further comprise purifying said single-stranded nucleic acid template ligated to said first single-stranded adaptor after said ligation. The method can further comprise purifying said linear amplification product ligated to said second adaptor. The method can further comprise sequencing said linear amplification product ligated to said second adaptor.

In some aspects, the disclosure provides for a method of generating a nucleic acid library, said method comprising: (a) annealing a primer comprising a 5′ phosphate to an RNA molecule; (b) extending said primer to generate a first cDNA strand; (c) ligating a first single-stranded adaptor to a 5′ end of said first cDNA strand, thereby generating a first cDNA strand ligated to a first single-stranded adaptor; (d) annealing a target-specific oligonucleotide probe to a target sequence in said first cDNA strand ligated to a first single-stranded adaptor, wherein said target-specific oligonucleotide probe comprises a 3′ end that anneals to said target sequence and a 5′ end comprising a second adaptor; (e) extending said annealed target-specific oligonucleotide probe, thereby generating an extension product; and (f) amplifying said extension product using a first primer comprising sequence of said first single-stranded adaptor and a second primer comprising sequence of said second adaptor. The RNA can comprise mRNA. The primer can comprise a random primer. The random primer can comprise a random hexamer sequence. The target sequence can comprise a gene sequence. The first single-stranded adaptor and said second adaptor can be different. The RNA molecule can comprise a junction between two genes resulting from a gene fusion. The gene fusion can be associated with cancer.

In some aspects, described herein is a method for preparing a nucleic acid library, said method comprising: (a) ligating a first single-stranded adaptor to a 5′ end of a single-stranded nucleic acid fragment to generate a single-stranded nucleic acid fragment comprising a 5′ adaptor; (b) hybridizing a target-specific oligonucleotide probe to a target sequence in said single-stranded nucleic acid fragment comprising a 5′ adaptor to create a hybridization product, wherein said target-specific oligonucleotide probe comprises a 3′ end that anneals to said target sequence and a 5′ end comprising a second adaptor; (c) extending said target-specific oligonucleotide probe annealed to said target sequence to generate an extension product; and (d) amplifying said extension product using a first primer comprising sequence of said first single-stranded adaptor and a second primer comprising sequence of said second adaptor, wherein said amplifying comprises performing from 2 to 15 polymerase chain reaction (PCR) cycles in solution. The method can further comprise phosphorylating a 5′ end of a double stranded DNA and denaturing said double stranded DNA to generate said single-stranded nucleic acid fragment of step a), wherein said single-stranded nucleic acid fragment comprises said 5′ phosphate. The single-stranded nucleic acid fragment can comprise DNA. The DNA can comprise genomic DNA. The single-stranded nucleic acid fragment can comprise RNA. The method can further comprise fragmenting RNA to generate said single-stranded nucleic acid of step a). The method can further comprise phosphorylating a 5′ end of said RNA before step a). The method can further comprise pre-adenylating said RNA before step a). The extending can be performed using a reverse transcriptase. The method can further comprise degrading said RNA after step c). The method can further comprise pre-adenylating said single-stranded nucleic acid before step a). The method can further comprise performing a purification step to remove unligated first single-stranded adaptor between step a) and step b). The single-stranded nucleic acid fragment can be a cell-free nucleic acid from a biological sample.

In some instances, an aspect of the present disclosure provides a method comprising: (a) identifying a set of sequences that anneal to sequences in a nucleic acid sample; (b) generating a first set of primers based on the set of sequences; (c) creating a first nucleic acid library by annealing the first set of primers to nucleic acid in a first sample from a subject; (d) performing massively parallel sequencing on the nucleic acid library to determine a profile of mutations in the nucleic acid library; (e) generating a second set of primers based on the set of sequences in step a), wherein the second set of primers comprise sequences from a subset of primers in the first set of primers; and (f) analyzing a second sample from the subject using the second set of primers.

In some embodiments, the nucleic acid sample comprises a human DNA genome. In some embodiments, the first set of primers anneal to genes mutated in a cancer. In another embodiment, the first set of primers anneal to genes mutated in more than one cancer. In another embodiment, the first set of primers anneal to genes mutated in a colon cancer, lung cancer, or breast cancer. Some embodiments of aspects provided herein further comprise using the profile of mutations to determine potential therapies for the subject. In some embodiments of the aspects provided herein, the second sample is a cell-free DNA sample. In some embodiments, the cell-free DNA sample comprises plasma, urine, cerebrospinal fluid, mucosal secretions, semen, saliva, amniotic fluid, or another bodily fluid.

In some embodiments, the analyzing of step f) comprises massively parallel sequencing. In some embodiments, the analyzing of step f) comprises using the second set of primers to generate a second nucleic acid library from the second sample. In other embodiments, the analyzing of step f) comprises amplification. In some embodiments, the amplification comprises PCR. In some embodiments, the PCR comprises digital PCR. In some embodiments, the digital PCR comprises droplet digital PCR.

In some embodiments of the aspects provided herein, a sequence identified in step a) is used to generate a primer in the second set of primers in which the 3′-most base of the primer overlays a single nucleotide variant. In some embodiments, the second set of primers comprises a primer in which a 3′-most base anneals to a wild-type allele at a location of the single nucleotide variant and a primer in which a 3′-most base anneals to a mutant allele at the location of the single nucleotide variant. In yet another embodiment, a sequence identified in step a) is used to generate a set of primers that span a breakpoint. In yet another embodiment, step c) identifies a copy-number alteration at a locus and the second set of primers anneals to the locus. In yet another embodiment, the second set of primers anneals to the locus and a third set of primers anneals to a reference locus that was not identified as having a copy-number alteration. In some embodiments, a sequence identified in step a) is detected at a decreased level compared to a reference sequence by the massively parallel sequencing and the second set of primers anneals to the sequence detected at the decreased level. In yet another embodiment, the second set of primers anneals to the sequence detected at the decreased level and a third set of primers anneals to a reference locus detected at a normal level.
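
By way of illustration only, the paired wild-type and mutant primers described above, each having its 3′-most base at the location of the single nucleotide variant, could be derived from a reference sequence as in the following minimal sketch. The function name, the 0-based coordinate convention, the default primer length, and the restriction to forward-strand primers are assumptions made for the example and are not taken from the disclosure; melting temperature, specificity, and secondary structure are not considered.

```python
def allele_specific_primers(reference, snv_index, mutant_base, length=20):
    """Return (wild_type_primer, mutant_primer) for the forward strand.

    Both primers end (3'-most base) exactly at the SNV position.
    `snv_index` is a 0-based index into `reference` (hypothetical convention).
    """
    reference = reference.upper()
    mutant_base = mutant_base.upper()
    if not 0 <= snv_index < len(reference):
        raise ValueError("snv_index lies outside the reference sequence")
    if mutant_base == reference[snv_index]:
        raise ValueError("mutant base is identical to the reference base")
    start = snv_index - length + 1
    if start < 0:
        raise ValueError("not enough upstream sequence for the requested length")
    # Wild-type primer: reference bases up to and including the SNV position.
    wild_type = reference[start:snv_index + 1]
    # Mutant primer: same sequence with the 3'-most base replaced.
    mutant = wild_type[:-1] + mutant_base
    return wild_type, mutant


if __name__ == "__main__":
    ref = "ACGTACGTACGTACGTACGTACGT"
    wt, mut = allele_specific_primers(ref, snv_index=11, mutant_base="A", length=8)
    print(wt, mut)
```

In this sketch the two primers differ only at their 3′-most base, so extension from each primer depends on which allele is present at the variant location.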

In some embodiments of the aspects provided herein, any of the described methods further comprise monitoring an efficacy of a treatment provided to the subject over time. In yet another embodiment of any of the described methods, the set of sequences comprise a sequence that anneals to TP53. In yet another embodiment of any of the described methods, the first set of primers comprises a sequence that anneals to TP53. In yet another embodiment of any of the described methods, the second set of primers comprises a sequence that anneals to TP53. In some embodiments, the set of sequences anneal across a genome.

Another aspect of the present disclosure provides a method comprising: (a) generating a nucleic acid library from a first sample from a subject, wherein the sample comprises nucleic acid from a tumor; (b) performing massively parallel sequencing on the nucleic acid library to determine a profile of mutations in the tumor; (c) detecting a presence or absence of a mutation in the profile of mutations in a second sample from the subject by massively parallel sequencing, and, if the mutation is not detected by massively parallel sequencing, detecting a presence or absence of the mutation using digital PCR.

In some embodiments, the digital PCR comprises droplet digital PCR. In yet another embodiment, the mutation is not detected by massively parallel sequencing in step c). In some embodiments, the mutation is not detected by massively parallel sequencing in step c) because it is present below a detection threshold of the massively parallel sequencing. In yet another embodiment, the mutation is not detected by massively parallel sequencing in step c) and the mutation is not detected by digital PCR in step c). In yet another embodiment, the method further comprises analyzing for the mutation in a third sample from the subject, wherein the third sample is taken after the first sample and the second sample.

In some embodiments, the mutation is detected in the third sample. In yet another embodiment, the detection of the mutation in the third sample indicates a recurrence of cancer. In some embodiments, the method further comprises resequencing the first sample by massively parallel sequencing. In some embodiments, the massively parallel sequencing comprises use of reversibly terminating nucleotides.

Another aspect of the present disclosure provides a method for generating a reference material, the method comprising: (a) obtaining deoxyribonucleic acid (DNA) extracted from two or more biological samples; (b) mixing said DNA to produce a DNA mixture; (c) incubating said DNA mixture with purified histones and chromatin assembly factors; and (d) fragmenting said DNA mixture to produce said reference sample.

In some embodiments, the method further comprises aliquoting and freezing said reference sample. In some embodiments, the two or more biological samples are cell lines from reference germline genomes. In some embodiments, the DNA is mixed such that DNA from each of the two or more biological samples is present in a known ratio. In some embodiments, DNA from one of said two or more biological samples is present in said DNA mixture at about 0.01% to about 0.5%. In yet another embodiment, DNA from one of said two or more biological samples is present in said DNA mixture at about 0.1% to about 0.5%. In yet another embodiment, DNA from one of said two or more biological samples is present in said DNA mixture at about 0.5% to about 1%. In yet another embodiment, DNA from one of said two or more biological samples is present in said DNA mixture at about 1% to about 5%.

Another aspect of the present disclosure provides a method for generating a reference material, the method comprising: (a) isolating nucleic acid from a first sample; (b) fragmenting nucleic acid from the nuclei; and (c) using the fragmented nucleic acid from the nuclei as a reference material for cell-free nucleic acid sample.
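
As a worked illustration of mixing DNA from two biological samples in a known ratio, the following minimal sketch computes the mass of each DNA needed so that the minor component is present at a chosen percentage of the mixture. The function name and the use of nanograms are assumptions made for the example; the sketch further assumes that both samples have the same genome mass, so that a mass fraction approximates a genome-copy fraction.

```python
def mixture_masses(total_ng, minor_percent):
    """Return (minor_ng, major_ng) for a two-component DNA mixture.

    `minor_percent` is the desired percentage (by mass) of the minor
    component, e.g. 0.5 for a 0.5% mixture. Hypothetical helper.
    """
    if total_ng <= 0:
        raise ValueError("total mass must be positive")
    if not 0 < minor_percent < 100:
        raise ValueError("minor_percent must be between 0 and 100")
    minor_ng = total_ng * minor_percent / 100
    major_ng = total_ng - minor_ng
    return minor_ng, major_ng


if __name__ == "__main__":
    # A 1000 ng mixture in which one sample contributes 0.5% of the DNA.
    minor, major = mixture_masses(1000, 0.5)
    print(f"minor: {minor} ng, major: {major} ng")
```

For very low percentages, such as about 0.01%, the minor mass becomes small relative to typical pipetting volumes, so in practice an intermediate dilution of the minor DNA may be prepared first; that step is not modeled here.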

In some embodiments, the fragmenting comprises use of chromatin from the nuclei. In some embodiments, the fragmenting comprises use of an enzyme. In some embodiments, the enzyme comprises a DNase. In some embodiments, the method further comprises isolating nucleic acid from a second sample, fragmenting nucleic acid from the nuclei from the second sample, and mixing the fragmented nucleic acid from the first sample and the fragmented nucleic acid from the second sample to produce a reference material. In some embodiments, the first sample comprises a non-cancerous cell.

Another aspect of the present disclosure provides a method for generating a reference material from cell-free nucleic acid, the method comprising: (a) inducing apoptosis or necrosis in a first sample; (b) extracting nuclei or other cell component comprising nucleic acid from the first sample; (c) using the nucleic acid from the nuclei or the cell component as a reference for cell-free nucleic acid.

In some embodiments, the extracting comprises use of a detergent. In yet another embodiment, the extracting comprises use of osmotic shock. In yet another embodiment, extracting comprises use of differential centrifugation. In some embodiments, the method comprises extracting nuclei comprising nucleic acid from the first sample. In yet another embodiment, the method comprises extracting other cell component comprising nucleic acid from the first sample. In some embodiments, the method further comprises mixing a second sample of nucleic acid fragments with the nucleic acid from the nuclei or the cell component, and using the mixture as a reference for cell-free nucleic acid. In yet another embodiment, the method further comprises inducing apoptosis or necrosis in a second tissue to generate the second sample of nucleic acid fragments.

Another aspect of the present disclosure provides a method for generating a reference material for cell-free nucleic acid, the method comprising: (a) isolating nucleic acid from a culture media; and (b) using the nucleic acid isolated from the culture media as a reference for cell-free nucleic acid.

In some embodiments, the nucleic acid is from cells grown in the culture media. In some embodiments, the cells are human cells. In some embodiments, the human cells are a human cell line. In some embodiments, the human cell line is derived from tumor tissue.

Another aspect of the present disclosure provides a method for ligating single-stranded donor nucleic acid molecules and single-stranded acceptor nucleic acid molecules, the method comprising: (a) transferring a nucleotide monophosphate (NMP) to the single-stranded donor nucleic acid molecules in a reaction mixture, thereby generating single-stranded donor nucleic acid molecules comprising the NMP; (b) after step a), adding the single-stranded acceptor nucleic acid molecules to the reaction mixture; and (c) ligating the single-stranded acceptor nucleic acid molecules to the single-stranded donor nucleic acid molecules comprising the NMP in the reaction mixture, wherein the reaction mixture in which the ligation occurs has a pH of at least pH 7.1, and wherein an efficiency of ligating the single-stranded donor nucleic acid molecules is over 10%. In some embodiments, the pH is pH 7.1 to about pH 9.

Another aspect of the present disclosure provides a method for ligating single-stranded donor nucleic acid molecules and single-stranded acceptor nucleic acid molecules, the method comprising: (a) transferring a nucleotide monophosphate (NMP) to the single-stranded donor nucleic acid molecules in a reaction mixture, thereby generating single-stranded donor nucleic acid molecules comprising the NMP; (b) after step a), sedimenting a ligase complexed with the single-stranded donor nucleic acid molecules comprising the NMP; and (c) after step b), ligating the single-stranded acceptor nucleic acid molecules to the single-stranded donor nucleic acid molecules comprising the NMP. In some embodiments, an efficiency of ligating the single-stranded donor nucleic acid molecules is over 10%.

Another aspect of the present disclosure provides a method for generating a nucleic acid library comprising: (a) ligating a first single-stranded adaptor to a 3′ end of a single-stranded template to generate a single-stranded template ligated to the first single-stranded adaptor; (b) annealing a primer to the single-stranded adaptor ligated to the single-stranded template; (c) performing linear amplification using the primer to generate a linear amplification product comprising the primer and sequence complementary to the template; and (d) ligating a second adaptor to a 3′ end of the linear amplification product.

In some embodiments, the adaptor is from about 19 bases to about 25 bases. In some embodiments, the linear amplification is performed under isothermal conditions. In some embodiments, the linear amplification is performed with Bst DNA polymerase. In yet another embodiment, the linear amplification is performed under cycling conditions. In yet another embodiment, the linear amplification is performed with a thermostable polymerase. In some embodiments, the method further comprises purifying the single-stranded template ligated to the first single-stranded adaptor after the ligation. In yet another embodiment, the method further comprises purifying the linear amplification product ligated to the second adaptor. In yet another embodiment, the method further comprises sequencing the linear amplification product ligated to the second adaptor.

In some embodiments, the fragmenting is by a nuclease. In some embodiments, the nuclease is DNase I. In yet another embodiment, the fragmenting is by a nebulizer. In some embodiments, the reference sample has a mean fragment length of about 140 to about 180 bases. In yet another embodiment, the reference sample has a mean fragment length of about 150 to about 170 bases.

In some instances, the disclosure provides a method of assessing cancer, comprising: (a) determining the presence, absence, and/or amount of each of a subset of genes in a sample derived from a sample from a subject, wherein the subset is determined by (i) performing targeted sequencing on a set of genes on a solid tissue sample from the subject wherein the solid tissue sample is known or suspected of comprising cancerous tissue; (ii) determining a profile of somatic genetic abnormalities for the set of genes in the tumor based on the sequencing; and (iii) selecting a subset of 2, 3, or 4, but no more than 4 genes of the set of genes based on the profile for the set, wherein the subset is specific to the individual; and (b) from the results of step (a) determining the status of the cancer in the subject.

The method can comprise (a) determining the presence, absence, and/or amount of each of a subset of genes in a sample derived from a fluid sample in a subject, wherein the subset is determined by (i) performing targeted sequencing on a set of genes from an unfixed or fixed solid tissue sample from the subject wherein the solid tissue sample is known or suspected of comprising cancerous tissue; (ii) determining a profile of genetic abnormalities for the set of genes based on the sequencing; and (iii) selecting a subset of the set of genes based on the profile for the set, wherein the subset is specific to the individual; and (b) from the results of step (a) determining the status of the cancer in the subject.

The method can comprise (a) determining the presence, absence, and/or amount of each of a subset of genes in a sample derived from a fluid sample in a subject, wherein the subset is determined by (i) performing targeted sequencing on a set of genes from a first fluid sample from the subject wherein the first fluid sample is known or suspected of comprising nucleic acids from cancerous tissue; (ii) determining a profile of genetic abnormalities for the set of genes based on the sequencing; and (iii) selecting a subset of the set of genes based on the profile for the set, wherein the subset is specific to the individual; and (b) from the results of step (a) determining the status of the cancer in the subject.

In a related embodiment, the method comprises (a) determining the presence, absence, and/or amount of each of a subset of genes in a sample derived from a fluid sample in a subject, wherein the subset is determined by (i) performing targeted sequencing on a set of genes from a bodily fluid sample from the subject wherein the bodily fluid sample is known or suspected of comprising tumor-derived nucleic acid; (ii) determining a profile of genetic abnormalities for the set of genes based on the sequencing; and (iii) selecting a subset of the set of genes based on the profile for the set, wherein the subset is specific to the individual; and (b) from the results of step (a) determining the status of the cancer in the subject.

In practicing any of the methods described herein, the set of genes comprises at least 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 150, 200, 300, 400, 500, 600, 700, 800, 900, or 1000 genes.

The set of genes can be selected from the group consisting of: ABCA1, BRAF, CHD5, EP300, FLT1, ITPA, MYC, PIK3R1, SKP2, TP53, ABCA7, BRCA1, CHEK1, EPHA3, FLT3, JAK1, MYCL1, PIK3R2, SLC19A1, TP73, ABCB1, BRCA2, CHEK2, EPHA5, FLT4, JAK2, MYCN, PKHD1, SLC1A6, TPM3, ABCC2, BRIP1, CLTC, EPHA6, FN1, JAK3, MYH2, PLCB1, SLC22A2, TPMT, ABCC3, BUB1B, COL1A1, EPHA7, FOS, JUN, MYH9, PLCG1, SLCO1B3, TPO, ABCC4, C1orf144, COPS5, EPHA8, FOXO1, KBTBD11, NAV3, PLCG2, SMAD2, TPR, ABCG2, CABLES1, CREB1, EPHB1, FOXO3, KDM6A, NBN, PML, SMAD3, TRIO, ABL1, CACNA2D1, CREBBP, EPHB4, FOXP4, KDR, NCOA2, PMS2, SMAD4, TRRAP, ABL2, CAMKV, CRKL, EPHB6, GAB1, KIT, NEK11, PPARG, SMARCA4, TSC1, ACVR1B, CARD11, CRLF2, EPO, GATA1, KLF6, NF1, PPARGC1A, SMARCB1, TSC2, ACVR2A, CARM1, CSF1R, ERBB2, GLI1, KLHDC4, NF2, PPP1R3A, SMO, TTK, ADCY9, CAV1, CSMD3, ERBB3, GLI3, KRAS, NKX2-1, PPP2R1A, SOCS1, TYK2, AGAP2, CBFA2T3, CSNK1G2, ERBB4, GNA11, LMO2, NOS2, PPP2R1B, SOD2, TYMS, AKT1, CBL, CTNNA1, ERCC1, GNAQ, LRP1B, NOS3, PRKAA2, SOS1, UGT1A1, AKT2, CCND1, CTNNA2, ERCC2, GNAS, LRP2, NOTCH1, PRKCA, SOX10, UMPS, AKT3, CCND2, CTNNB1, ERCC3, GPR124, LRP6, NOTCH2, PRKCZ, SOX2, USP9X, ALK, CCND3, CYFIP1, ERCC4, GPR133, LTK, NOTCH3, PRKDC, SP1, VEGF, ANAPC5, CCNE1, CYLD, ERCC5, GRB2, MAN1B1, NPM1, PTCH1, SPRY2, VEGFA, APC, CD40LG, CYP19A1, ERCC6, GSK3B, MAP2K1, NQO1, PTCH2, SRC, VHL, APC2, CD44, CYP1B1, ERG, GSTP1, MAP2K2, NR3C1, PTEN, ST6GAL2, WRN, AR, CD79A, CYP2C19, ERN2, GUCY1A2, MAP2K4, NRAS, PTGS2, STAT1, WT1, ARAF, CD79B, CYP2C8, ESR1, HDAC1, MAP2K7, NRP2, PTPN11, STAT3, XPA, ARFRP1, CDC42, CYP2D6, ESR2, HDAC2, MAP3K1, NTRK1, PTPRB, STK11, XPC, ARID1A, CDC42BPB, CYP3A4, ETV4, HGF, MAPK1, NTRK2, PTPRD, SUFU, ZFY, ATM, CDC73, CYP3A5, EWSR1, HIF1A, MAPK3, NTRK3, RAD50, SULT1A1, ZNF521, ATP5A1, CDH1, DACH2, EXT1, HM13, MAPK8, OMA1, RAD51, SUZ12, ATR, CDH10, DCC, EZH2, HMGA1, MARK3, OR10R2, RAF1, TAF1, AURKA, CDH2, DCLK3, FANCA, HNF1A, MCL1, PAK3, RARA, TBX22, AURKB, CDH20, DDB2, FANCD2, HOXA3, MDM2, PARP1, RB1, TCF12, BAI3, CDH5, DDR2, FANCE, HOXA9, MDM4, PAX5, REM1, TCF3, BAP1, CDK2, DGKB, FANCF, HRAS, MECOM, PCDH15, RET, TCF4, BARD1, CDK4, DGKZ, FAS, HSP90AA1, MEN1, PCDH18, RICTOR, TEK, BAX, CDK6, DIRAS3, FBXW7, IDH1, MET, PCNA, RIPK1, TEP1, BCL11A, CDK7, DLG3, FCGR3A, IDH2, MITF, PDGFA, ROR1, TERT, BCL2, CDK8, DLL1, FES, IFNG, MLH1, PDGFB, ROR2, TET2, BCL2A1, CDKN1A, DNMT1, FGFR1, IGF1R, MLL, PDGFRA, ROS1, TGFBR2, BCL2L1, CDKN1B, DNMT3A, FGFR2, IGF2R, MLL3, PDGFRB, RPS6KA2, THBS1, BCL2L2, CDKN2A, DNMT3B, FGFR3, IKBKE, MPL, PDZRN3, RPTOR, TNFAIP3, BCL3, CDKN2B, DOT1L, FGFR4, IKZF1, MRE11A, PHLPP2, RSPO2, TNKS, BCL6, CDKN2C, DPYD, FH, IL2RG, MSH2, PIK3C3, RSPO3, TNKS2, BCR, CDKN2D, E2F1, FHOD3, INHBA, MSH6, PIK3CA, RUNX1, TNNI3K, BIRC5, CDX2, EED, FIGF, INSR, MTHFR, PIK3CB, SDHB, TNR, BIRC6, CEBPA, EGF, FLG2, IRS1, MTOR, PIK3CD, SF3B1, TOP1, BLM, CERK, EGFR, FLNC, IRS2, MUTYH, PIK3CG, SHC1, and TOP2A.

The fluid sample can be selected from the group consisting of: blood, serum, plasma, urine, sweat, tears, saliva, sputum, mucosal secretions, components thereof or any combination thereof. Steps (a) and (b) can be performed at a plurality of time points to monitor the status of the cancer over time. One time point can be prior to a first administration of a cancer therapy and a subsequent time point can be subsequent to a first administration.

The method can further comprise generating a report communicating the profile of genetic abnormalities for the set of genes and communicating the report to a caregiver. The report can comprise a list of one or more somatic tumor aberrations of therapeutic relevance and possible therapy candidates based on the profile. The report can be generated within two weeks from collection of the solid tissue sample. In some instances, the report is generated within 1 week from collection of the solid tissue sample. In some embodiments, the report comprises single nucleotide somatic mutations of the set of genes. In some embodiments, the report comprises small somatic insertions or deletions of two or more adjacent nucleotides in the sequence of the set of genes. In some embodiments, the report comprises somatic copy number alterations of the set of genes. In some embodiments, the report comprises structural genomic alterations comprising the set of genes. In some embodiments, the report comprises a description of a therapeutic agent targeting a tumor characteristic derived from or marked by the presence of a tumor somatic mutation, or a therapeutic agent that is more effective in the presence of the tumor characteristic derived from or marked by the tumor somatic mutation. The method can further comprise generating a report communicating the profile of the subset of genes at each of the plurality of time points.

In some embodiments of any of the methods herein, the determining comprises the step of diluting nucleic acid molecules from the sample into discrete reaction volumes, wherein the discrete reaction volumes contain on average less than 10, 5, 4, 3, 2, or 1 nucleic acid molecule from the sample. In some embodiments, the discrete reaction volumes contain 0-10 molecules of the nucleic acid from the sample. The discrete reaction volumes can be droplets in an emulsion. The discrete reaction volumes can further comprise primers for allelic discrimination of the genetic abnormalities in the subset of genes. In some embodiments, gene fusions can be detected by the use of primers that span a breakpoint. In some cases, these primers are designed based on sequence data generated from nucleic acids from the tumor. In some embodiments, gene fusions can be detected by designing a first and second primer set that target a first and second gene suspected to have undergone gene fusion, wherein each primer set is distinctly labeled. In such cases, digital droplet PCR can be performed on a sample with both primer sets. In some embodiments, a sample comprising nucleic acids that have undergone the gene fusion event will have a greater proportion of droplets in which the distinct signals colocalize than a reference sample that does not comprise the gene fusion event.
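As a purely illustrative sketch, and not part of the disclosed methods, the dilution and colocalization statements above can be explored numerically. The sketch assumes that molecules distribute among discrete reaction volumes according to a Poisson distribution and that two unlinked targets occupy volumes independently. The function names and both assumptions are introduced here for illustration only.

```python
import math


def poisson_pmf(k: int, mean: float) -> float:
    """Probability that a reaction volume holds exactly k molecules (Poisson assumption)."""
    if k < 0 or mean < 0:
        raise ValueError("k and mean must be non-negative")
    return math.exp(-mean) * mean ** k / math.factorial(k)


def occupancy(mean: float) -> dict:
    """Expected fractions of empty, singly occupied, and multiply occupied volumes."""
    empty = poisson_pmf(0, mean)
    single = poisson_pmf(1, mean)
    return {"empty": empty, "single": single, "multiple": 1.0 - empty - single}


def chance_colocalization(frac_a: float, frac_b: float) -> float:
    """Double-positive fraction expected by chance if targets A and B are unlinked."""
    return frac_a * frac_b


def colocalization_excess(frac_double: float, frac_a: float, frac_b: float) -> float:
    """Observed double-positive fraction minus the chance expectation."""
    return frac_double - chance_colocalization(frac_a, frac_b)
```

Under these assumptions, a positive excess over the chance expectation is consistent with the two labeled targets residing on the same molecule, as would be expected for a fusion; how large an excess is meaningful would have to be established empirically against a reference sample.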

Determining the status can comprise quantifying the number of nucleic acids harboring the genetic abnormalities in the subset of genes. The step of targeted sequencing can comprise preparing a DNA library from the solid tissue sample in less than 8, 7, 6, 5, or 4 hours. In some embodiments, the preparing does not require exponential PCR amplification prior to sequencing of the library. In some embodiments, the preparing comprises a linear amplification step. In some embodiments, the preparing does not require amplification.

In some embodiments, the step of targeted sequencing comprises (a) ligating a single-stranded adaptor to a 5′ end of a single-stranded DNA fragment from a solid tissue sample, wherein the single-stranded adaptor comprises a first adaptor sequence specific for coupling to a sequencing platform; (b) contacting the single-stranded DNA fragment ligated to the single-stranded adaptor with a target-specific oligonucleotide comprising (i) a region specific for a region of a cancer-related gene and (ii) a second adaptor sequence specific for coupling to a sequencing platform; (c) performing a hybridization reaction to join the target-specific oligonucleotide to a single-stranded DNA fragment containing a region of complementarity to the target-specific oligonucleotide; (d) performing an extension reaction to create an extension product comprising the region and comprising the second adaptor; and (e) sequencing the extension product. Contacting can occur with the target-specific oligonucleotide attached to a sequencing platform. Contacting can occur with the target-specific oligonucleotide covalently attached to a solid support. Contacting can occur with the target-specific oligonucleotide affinity bound to a solid support. Contacting can occur with the target-specific oligonucleotide free in a solution.

In some embodiments, the adaptors comprise barcodes that tag unique template molecules. In some embodiments, the sample can be amplified to obtain multiple redundant copies of the initial template molecules. In some embodiments, the amplified nucleic acids can be sequenced. In some embodiments, the sequences derived from amplified nucleic acids derived from the same initial template molecule are identified by their barcode. In some embodiments, reads representing copies derived from the same initial template molecules can be integrated to distinguish between genetic variations present in the template molecules and errors produced by nucleic acid amplification and sequencing.
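One way the integration of reads sharing a barcode could be sketched is a per-position majority vote within each barcode family. The following is a minimal illustration only: the function name `consensus_by_barcode`, the `min_fraction` threshold, and the assumption that reads in a family are already aligned and of equal length are all hypothetical and are not taken from the disclosure.

```python
from collections import Counter, defaultdict


def consensus_by_barcode(reads, min_fraction=0.6):
    """Collapse reads that share a barcode into one consensus sequence per barcode.

    reads: iterable of (barcode, sequence) pairs; sequences within a barcode
    family are assumed to be aligned and of equal length.
    A position is reported as "N" when no base reaches min_fraction support.
    """
    families = defaultdict(list)
    for barcode, sequence in reads:
        families[barcode].append(sequence)

    consensus = {}
    for barcode, sequences in families.items():
        length = len(sequences[0])
        if any(len(s) != length for s in sequences):
            raise ValueError(f"unequal read lengths in family {barcode!r}")
        bases = []
        for column in zip(*sequences):
            base, count = Counter(column).most_common(1)[0]
            bases.append(base if count / len(sequences) >= min_fraction else "N")
        consensus[barcode] = "".join(bases)
    return consensus
```

In this sketch, a base change seen in only a minority of reads from one family is treated as a likely amplification or sequencing error, whereas a change carried by the family consensus is treated as a candidate variant of the original template molecule.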

In some aspects, the present disclosure provides methods and kits for the sensitive detection of a mutation in a target polynucleotide. The disclosure provides an oligonucleotide primer, comprising a probe-binding region and a template binding region. In some embodiments, the template binding region is at least 50% complementary to a template nucleic acid suspected of harboring a mutation. In some embodiments, a portion of the template binding region at least partially overlays a locus of the suspected mutation. In some embodiments, the oligonucleotide primer upon hybridization to the template nucleic acid is extendable by a polymerase if the mutation is present but is not extendable by the polymerase if the mutation is not present. In some embodiments, the template binding region comprises a 3′ terminal region that overlays the mutation locus. In some embodiments, the 3′ terminal region that overlays the mutation locus comprises 1, 2, 3, 4, 5, or more than 5 bases of the 3′-end of the template binding region. In some embodiments, the mutation is a single nucleotide polymorphism (SNP). In some embodiments, the mutation is a small insertion or deletion. In some embodiments, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more nucleotides are inserted or deleted.

In particular embodiments, the 3′ terminal region comprises a base that overlays the SNP locus. In some embodiments, the base is complementary to a mutant allele of the SNP locus. In some embodiments, the base is complementary to a wild-type allele of the SNP locus. In some embodiments, the probe-binding region does not hybridize to any genomic sequence from the subject. In some embodiments, the polymerase is a DNA polymerase lacking 3′ to 5′ exonuclease activity.
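The allele-selective behavior described above can be illustrated with a deliberately simplified model in which a primer is treated as extendable only when its 3′ terminal base pairs with the template base at the locus. Actual extension also depends on the polymerase, the mismatch type, and the reaction conditions, none of which are modeled here. The function names and the example sequences are hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def three_prime_base_pairs(template_binding_region: str, template_site: str) -> bool:
    """True if the primer's 3' terminal base pairs with the template at the locus.

    template_site is the template strand segment (5'->3') the primer anneals to,
    with the locus at its 5'-most position, so that the primer's 3' terminal
    base lies opposite template_site[0].
    """
    if len(template_binding_region) != len(template_site):
        raise ValueError("primer region and template site must be the same length")
    return COMPLEMENT[template_binding_region[-1]] == template_site[0]


def percent_complementarity(template_binding_region: str, template_site: str) -> float:
    """Percentage of primer positions that pair with the template site."""
    perfect = reverse_complement(template_site)
    matches = sum(p == q for p, q in zip(template_binding_region, perfect))
    return 100.0 * matches / len(template_binding_region)
```

For instance, with a hypothetical mutant template site `TGCAAC` and wild-type site `CGCAAC`, the primer region `GTTGCA` pairs at its 3′ terminal base only with the mutant site, while remaining well above 70% complementary to both.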

The disclosure also provides a kit comprising: (a) an oligonucleotide primer, wherein the oligonucleotide primer comprises (i) a probe-binding region and (ii) a template binding region that is at least 70% complementary to a template nucleic acid suspected of harboring a mutation, wherein a portion of the template binding region at least partially overlays a locus of the suspected mutation, and wherein the oligonucleotide primer upon hybridization to the template nucleic acid is extendable by a polymerase if the mutation is present but is not extendable by the polymerase if the mutation is not present; and (b) instructions for use. In some embodiments, the 3′-ultimate and/or penultimate bases of the primer have phosphorothioate linkages. In some embodiments, the mutation is a single nucleotide polymorphism (SNP). In some embodiments, the template binding region comprises a 3′ terminal base that overlays the SNP locus. In some embodiments, the 3′ terminal base is complementary to a mutant allele of the SNP locus. In some embodiments, the 3′ terminal base is complementary to a wild-type allele of the SNP locus. In some embodiments, the probe-binding region does not hybridize to any genomic sequence from the subject. In some embodiments, the kit further comprises a reporter probe that is at least 70% complementary to the probe binding region. In some embodiments, the reporter probe comprises a detectable moiety and a quencher moiety, wherein the quencher moiety suppresses detection of the detectable moiety when the reporter probe is intact. In some embodiments, the kit further comprises a reverse primer that is at least 70% complementary to a reverse complement sequence downstream of the locus. In some embodiments, the kit further comprises a polymerase.

In some embodiments, the polymerase is a thermostable polymerase having a 5′ to 3′ exonuclease activity and not having a 3′ to 5′ exonuclease activity. In some embodiments, the polymerase is a thermostable polymerase having 3′ to 5′ exonuclease activity. In some embodiments, the polymerase is a thermostable polymerase having 3′ to 5′ exonuclease activity and the 3′-ultimate and/or penultimate bases of the primer have phosphorothioate linkages. In some embodiments, the kit further comprises one or more alternative oligonucleotide primers, wherein the one or more alternative oligonucleotide primers each comprises a distinct probe binding region and a template binding region that is at least 70% complementary to the template nucleic acid, wherein a portion of the template binding region at least partially overlays the locus, wherein the alternative oligonucleotide primer upon hybridization to the template nucleic acid is extendable by a polymerase if an alternative allele is present but is not extendable by the polymerase if the alternative allele is not present. In some embodiments, the kit further comprises one or more alternative reporter probes, wherein each of the alternative reporter probes is at least 70% complementary to one of the distinct probe binding regions but not to any other probe binding region of the kit. In some embodiments, each of the alternative reporter probes comprises an alternative detectable moiety and a quencher moiety, wherein each of the detectable moieties of the kit is detectably distinct from any other detectable moiety of the kit. In some embodiments, a hybridization product consisting of the oligonucleotide primer and reporter probe has a Tm that is at least 10 degrees higher than a Tm of a hybridization product consisting of the oligonucleotide primer and the template nucleic acid (see FIGS. 25-26). In another embodiment, the reporter probe has a Tm at least 5° C., at least 6° C., at least 7° C., at least 8° C., or at least 9° C. below the Tm of the hybridization product of the primer and template. In another embodiment, the reporter probe has a Tm at least 10° C. below the Tm of the hybridization product of the primer and template (see FIG. 35).

In another aspect, the disclosure provides a method of detecting a mutation in a target polynucleotide region, comprising: (a) selectively hybridizing an oligonucleotide primer to the target polynucleotide region, wherein the oligonucleotide primer comprises (i) a probe-binding region, and (ii) a template binding region that is at least 70% complementary to a template nucleic acid, for example a template nucleic acid suspected of harboring a mutation, wherein a portion of the template binding region at least partially overlays a locus of the suspected mutation, and wherein the oligonucleotide primer upon hybridization to the template nucleic acid is extendable by a polymerase if the mutation is present but is not extendable by the polymerase if the mutation is not present; (b) extending the hybridized oligonucleotide primer to form an extension product; and (c) detecting the extension product, whereby the detecting indicates the presence of the mutation. In some embodiments, extending comprises extending with a DNA polymerase that does not comprise 3′ to 5′ exonuclease activity.

In some embodiments, detecting comprises selectively hybridizing a reporter probe to the probe binding region. In some embodiments, the reporter probe comprises a detectable moiety and a quencher moiety, wherein the quencher moiety suppresses detection of the detectable moiety when the reporter probe is intact. In some embodiments, detecting further comprises separating the detectable moiety from the quencher moiety of the hybridized reporter probe. In some embodiments, the method further comprises amplifying the extension product with a reverse primer that is capable of hybridizing to a region of the extension product downstream of the locus. In some embodiments, amplifying comprises amplifying with a DNA polymerase that comprises 5′ to 3′ exonuclease and/or endonucleolytic activity. In some embodiments, the method further comprises selectively hybridizing one or more alternative oligonucleotide primers to the target polynucleotide region, wherein the one or more alternative oligonucleotide primers each comprises a distinct probe binding region and a template binding region that is at least 70% complementary to the template nucleic acid, wherein a portion of the template binding region at least partially overlays the locus, wherein the alternative oligonucleotide primer upon hybridization to the template nucleic acid is extendable by a polymerase if an alternative allele is present but is not extendable by the polymerase if the alternative allele is not present. In some embodiments, detecting further comprises selectively hybridizing one or more alternative reporter probes to the one or more alternative oligonucleotide primers, wherein each of the alternative reporter probes is at least 70% complementary to one of the distinct probe binding regions but not to any other of the probe binding regions. In some embodiments, each of the alternative reporter probes comprises an alternative detectable moiety and a quencher moiety, wherein each of the alternative detectable moieties is detectably distinct from any other of the detectable moieties. In some embodiments, the mutation is a single nucleotide polymorphism (SNP). In some embodiments, the template binding region comprises a 3′ terminal region comprising a base that overlays the SNP locus. In some embodiments, the base is complementary to a mutant allele of the SNP locus.

In some embodiments, the base is complementary to a wild-type allele of the SNP locus. In some embodiments, the probe-binding region does not hybridize to the target polynucleotide region. In some embodiments, a hybridization product of the oligonucleotide primer and reporter probe has a Tm that is at least 10 degrees higher than a Tm of a hybridization product between the oligonucleotide primer and target polynucleotide. In some embodiments, a concentration of the reporter probe is at least 10× a concentration of the forward primer. In some embodiments, the nucleic acid sample is subdivided into a plurality of discrete reaction volumes prior to steps (b)-(c). In some embodiments, the method further comprises detection of the detectable moiety in each of the reaction volumes. In some embodiments, the method further comprises counting a number of the reaction volumes wherein the detectable moiety is detected. In some embodiments, the nucleic acid sample is subdivided such that the plurality of discrete reaction volumes contain an average of <1, 1, or more than 1 template nucleic acid molecule. In some embodiments, the method further comprises providing a conclusion and transmitting the conclusion over a network.
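Counting the reaction volumes in which the detectable moiety is detected can be converted into an estimate of template abundance. The sketch below uses the standard Poisson correction for digital assays, in which the mean number of template molecules per volume is estimated as -ln(1 - p) for a positive fraction p. This correction, the function names, and the two-channel allele fraction are assumptions introduced for illustration and are not stated in the disclosure.

```python
import math


def copies_per_volume(positive: int, total: int) -> float:
    """Estimate mean template molecules per reaction volume from the positive count.

    Assumes templates are Poisson distributed among volumes, so the empty
    fraction equals exp(-mean). Undefined when every volume is positive.
    """
    if total <= 0 or positive < 0 or positive >= total:
        raise ValueError("need 0 <= positive < total")
    return -math.log(1.0 - positive / total)


def mutant_allele_fraction(mutant_positive: int, wildtype_positive: int, total: int) -> float:
    """Fraction of estimated template copies that carry the mutant allele."""
    mutant = copies_per_volume(mutant_positive, total)
    wildtype = copies_per_volume(wildtype_positive, total)
    if mutant + wildtype == 0:
        raise ValueError("no positive reaction volumes")
    return mutant / (mutant + wildtype)
```

The correction matters most when the average occupancy approaches or exceeds one molecule per volume, because a single positive volume may then contain more than one template.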

The disclosure also provides a composition comprising (a) an oligonucleotide primer hybridized to a template nucleic acid, wherein the template nucleic acid comprises a wild-type allele at a locus, wherein the 3′ terminal region of the oligonucleotide primer overlays the locus and is not complementary to the wild-type allele; and (b) an intact reporter probe comprising a detectable moiety and a quencher moiety, wherein the intact reporter probe is hybridized to the oligonucleotide primer.

The disclosure also provides a method, comprising: (a) hybridizing a target-selective oligonucleotide (TSO) to a single-stranded DNA (ssDNA) fragment in an ssDNA library to create a hybridization product; and (b) extending the hybridization product to create a double stranded extension product, wherein the TSO comprises (i) a sequence that is complementary to a single target region and (ii) a first single-stranded adaptor sequence located at a first end of the TSO but not at both ends of the TSO, and wherein the ssDNA fragment comprises a second single-stranded adaptor sequence but does not comprise the first single-stranded adaptor sequence. In some embodiments, the ssDNA fragment is ligated to a second single-stranded adaptor sequence by a ligation method comprising over 10%, 50%, 70%, or 90% ligation efficiency. In some embodiments, the ssDNA fragment is ligated to a second single-stranded adaptor sequence by a single-stranded ligation method. In some embodiments, the second single-stranded adaptor sequence is located at a first end of the ssDNA fragment but not at both ends of the ssDNA fragment. In some embodiments, the amplifying comprises linear amplification. In some embodiments, the first end of the ssDNA fragment is a 5′ end. In some embodiments, the first adaptor sequence comprises a barcode sequence. In some cases, the barcode sequence is used to identify the sample source of the nucleic acid. In some cases, the barcode sequence is used to identify independent ligation events. In some cases, the single-stranded adaptors are a population of adaptors comprising a large number of distinct barcode sequences. In some cases, the number of distinct barcode sequences is in excess of the number of ssDNA fragments from a given locus. In some cases, the distinct barcodes can be used to uniquely identify ssDNA fragments. In some embodiments, the first or second adaptor sequence comprises a barcode sequence. In some embodiments, the first end of the TSO is a 5′ end. In some embodiments, the first or second adaptor sequence comprises a sequence that is at least 70% identical to a support-bound oligonucleotide conjugated to a solid support. In some embodiments, the solid support is coupled to a sequencing platform. In some embodiments, the first or second adaptor sequence comprises a binding site for a sequencing primer. In some embodiments, the method further comprises annealing the extension products to the support-bound oligonucleotides. In some embodiments, the method further comprises amplifying the annealed extension products. In some embodiments, the method further comprises sequencing the annealed extension products. In some embodiments, the ssDNA library comprises genomic DNA fragments. In some embodiments, the ssDNA library comprises cDNA fragments. In some embodiments, the method further comprises removing unhybridized TSOs and unhybridized ssDNA library members. In some embodiments, steps (a) and (b) are performed when the ssDNA library members and the TSOs are free-floating in a solution.

In some embodiments, the single target region flanks a genomic region. In some embodiments, the genomic region comprises a portion of an exon region from a cancer-related gene. In some embodiments, the cancer-related gene is selected from the group consisting of ABCA1, BRAF, CHD5, EP300, FLT1, ITPA, MYC, PIK3R1, SKP2, TP53, ABCA7, BRCA1, CHEK1, EPHA3, FLT3, JAK1, MYCL1, PIK3R2, SLC19A1, TP73, ABCB1, BRCA2, CHEK2, EPHA5, FLT4, JAK2, MYCN, PKHD1, SLC1A6, TPM3, ABCC2, BRIP1, CLTC, EPHA6, FN1, JAK3, MYH2, PLCB1, SLC22A2, TPMT, ABCC3, BUB1B, COL1A1, EPHA7, FOS, JUN, MYH9, PLCG1, SLCO1B3, TPO, ABCC4, C1orf144, COPS5, EPHA8, FOXO1, KBTBD11, NAV3, PLCG2, SMAD2, TPR, ABCG2, CABLES1, CREB1, EPHB1, FOXO3, KDM6A, NBN, PML, SMAD3, TRIO, ABL1, CACNA2D1, CREBBP, EPHB4, FOXP4, KDR, NCOA2, PMS2, SMAD4, TRRAP, ABL2, CAMKV, CRKL, EPHB6, GAB1, KIT, NEK11, PPARG, SMARCA4, TSC1, ACVR1B, CARD11, CRLF2, EPO, GATA1, KLF6, NF1, PPARGC1A, SMARCB1, TSC2, ACVR2A, CARM1, CSF1R, ERBB2, GLI1, KLHDC4, NF2, PPP1R3A, SMO, TTK, ADCY9, CAV1, CSMD3, ERBB3, GLI3, KRAS, NKX2-1, PPP2R1A, SOCS1, TYK2, AGAP2, CBFA2T3, CSNK1G2, ERBB4, GNA11, LMO2, NOS2, PPP2R1B, SOD2, TYMS, AKT1, CBL, CTNNA1, ERCC1, GNAQ, LRP1B, NOS3, PRKAA2, SOS1, UGT1A1, AKT2, CCND1, CTNNA2, ERCC2, GNAS, LRP2, NOTCH1, PRKCA, SOX10, UMPS, AKT3, CCND2, CTNNB1, ERCC3, GPR124, LRP6, NOTCH2, PRKCZ, SOX2, USP9X, ALK, CCND3, CYFIP1, ERCC4, GPR133, LTK, NOTCH3, PRKDC, SP1, VEGF, ANAPC5, CCNE1, CYLD, ERCC5, GRB2, MAN1B1, NPM1, PTCH1, SPRY2, VEGFA, APC, CD40LG, CYP19A1, ERCC6, GSK3B, MAP2K1, NQO1, PTCH2, SRC, VHL, APC2, CD44, CYP1B1, ERG, GSTP1, MAP2K2, NR3C1, PTEN, ST6GAL2, WRN, AR, CD79A, CYP2C19, ERN2, GUCY1A2, MAP2K4, NRAS, PTGS2, STAT1, WT1, ARAF, CD79B, CYP2C8, ESR1, HDAC1, MAP2K7, NRP2, PTPN11, STAT3, XPA, ARFRP1, CDC42, CYP2D6, ESR2, HDAC2, MAP3K1, NTRK1, PTPRB, STK11, XPC, ARID1A, CDC42BPB, CYP3A4, ETV4, HGF, MAPK1, NTRK2, PTPRD, SUFU, ZFY, ATM, CDC73, CYP3A5, EWSR1, HIF1A, MAPK3, NTRK3, RAD50, SULT1A1, ZNF521, ATP5A1, CDH1, DACH2, EXT1, HM13, MAPK8, OMA1, RAD51, SUZ12, ATR, CDH10, DCC, EZH2, HMGA1, MARK3, OR10R2, RAF1, TAF1, AURKA, CDH2, DCLK3, FANCA, HNF1A, MCL1, PAK3, RARA, TBX22, AURKB, CDH20, DDB2, FANCD2, HOXA3, MDM2, PARP1, RB1, TCF12, BAI3, CDH5, DDR2, FANCE, HOXA9, MDM4, PAX5, REM1, TCF3, BAP1, CDK2, DGKB, FANCF, HRAS, MECOM, PCDH15, RET, TCF4, BARD1, CDK4, DGKZ, FAS, HSP90AA1, MEN1, PCDH18, RICTOR, TEK, BAX, CDK6, DIRAS3, FBXW7, IDH1, MET, PCNA, RIPK1, TEP1, BCL11A, CDK7, DLG3, FCGR3A, IDH2, MITF, PDGFA, ROR1, TERT, BCL2, CDK8, DLL1, FES, IFNG, MLH1, PDGFB, ROR2, TET2, BCL2A1, CDKN1A, DNMT1, FGFR1, IGF1R, MLL, PDGFRA, ROS1, TGFBR2, BCL2L1, CDKN1B, DNMT3A, FGFR2, IGF2R, MLL3, PDGFRB, RPS6KA2, THBS1, BCL2L2, CDKN2A, DNMT3B, FGFR3, IKBKE, MPL, PDZRN3, RPTOR, TNFAIP3, BCL3, CDKN2B, DOT1L, FGFR4, IKZF1, MRE11A, PHLPP2, RSPO2, TNKS, BCL6, CDKN2C, DPYD, FH, IL2RG, MSH2, PIK3C3, RSPO3, TNKS2, BCR, CDKN2D, E2F1, FHOD3, INHBA, MSH6, PIK3CA, RUNX1, TNNI3K, BIRC5, CDX2, EED, FIGF, INSR, MTHFR, PIK3CB, SDHB, TNR, BIRC6, CEBPA, EGF, FLG2, IRS1, MTOR, PIK3CD, SF3B1, TOP1, BLM, CERK, EGFR, FLNC, IRS2, MUTYH, PIK3CG, SHC1, and TOP2A.

In some embodiments, the ligation method with over 10%, 50%, 70%, or 90% efficiency is a single-stranded ligation method. In some embodiments, the ligation method comprises use of an RNA ligase. In some embodiments, the RNA ligase is CircLigase or CircLigase II. The disclosure also provides a method of preparing a single-stranded DNA library, comprising: (a) denaturing a double stranded DNA fragment into single stranded DNA (ssDNA) fragments and, optionally, excising damaged bases; (b) removing 5′ phosphates from the ssDNA fragments; (c) ligating single-stranded primer docking oligonucleotides (pdo's) to 3′ ends of the ssDNA fragments; (d) hybridizing primers to the pdo's, wherein the primers comprise a sequence complementary to the adaptor oligonucleotide sequence and comprise a first adaptor sequence that is at least 70% identical to a support-bound oligonucleotide coupled to a sequencing platform; (e) extending the hybridized primers to create duplexes, wherein each duplex comprises an ssDNA fragment and an extended primer strand; (f) denaturing the double-stranded extension product, wherein the denaturing results in release of the extended primer strands from the immobilized capturing reagent and retention of the ssDNA fragments on the immobilized capturing reagent; and (g) collecting the extended primer strands. In some embodiments, the method comprises repeating steps (d)-(f) in a linear amplification reaction, wherein the extended primer strands comprise the ssDNA library. In some embodiments, step (c) results in ligation of at least 50% of the ssDNA fragments to the pdo's. In some embodiments, the ligating is performed using an ATP-dependent ligase. In some embodiments, the ATP-dependent ligase is an RNA ligase. In some embodiments, the RNA ligase is CircLigase or CircLigase II. In some embodiments, the pdo's are adenylated. In some embodiments, the extending is performed using a proofreading DNA polymerase. In some embodiments, damaged bases can include oxidized bases and abasic sites. In some cases, the original base is a purine, and the damaged bases are removed by formamidopyrimidine [fapy]-DNA glycosylase. In some embodiments, the original base is a pyrimidine, and the damaged bases are removed by Endonuclease VIII. In some cases, the original base is cytosine that has been deaminated to produce uracil, and the damaged bases are removed by uracil-DNA glycosylase. In some embodiments, damaged bases can be removed from double stranded DNA or single stranded DNA.

The disclosure also provides a method of preparing a single-stranded DNA library, comprising: denaturing a double stranded DNA fragment into single stranded DNA (ssDNA) fragments; optionally, excising any damaged bases; ligating a first single-stranded adaptor sequence to a first end of the ssDNA fragments; and ligating a second single-stranded adaptor sequence to a second end of the ssDNA fragments. In some embodiments, damaged bases can include oxidized bases and abasic sites. In some cases, the original base is a purine, and the damaged bases are removed by formamidopyrimidine [fapy]-DNA glycosylase. In some embodiments, the original base is a pyrimidine, and the damaged bases are removed by Endonuclease VIII. In some cases, the original base is cytosine that has been deaminated to produce uracil, and the damaged bases are removed by uracil-DNA glycosylase.

The disclosure also provides a kit, comprising: a primer docking oligonucleotide (pdo); a primer, wherein the primer comprises a sequence that is at least 70% complementary to the pdo sequence and further comprises a first adaptor sequence that is at least 70% identical to a first support-bound oligonucleotide coupled to a sequencing platform; and instructions for use. In some embodiments, the kit includes enzymes used to excise any damaged bases, where such damaged bases can include oxidized bases and abasic sites. In some cases, the original base is a purine, and the kit comprises formamidopyrimidine [fapy]-DNA glycosylase. In some embodiments, the original base is a pyrimidine, and the kit comprises Endonuclease VIII. In some cases, the original base is cytosine that has been deaminated to produce uracil, and the kit comprises uracil-DNA glycosylase.

In some embodiments, the kit further comprises an ATP-dependent ligase. In some embodiments, the ATP-dependent ligase is an RNA ligase. In some embodiments, the RNA ligase is CircLigase or CircLigase II. In some embodiments, the kit further comprises a proofreading DNA polymerase. In some embodiments, the kit further comprises the immobilized capturing reagent. In some embodiments, the first adaptor sequence comprises a sequence that is at least 70% complementary to a first sequencing primer. In some embodiments, the first adaptor sequence comprises a barcode sequence. In some cases, the barcode sequence is used to identify the sample source of the nucleic acid. In some cases, the barcode sequence is used to identify independent ligation events. In some cases, the single-stranded adaptors are a population of adaptors comprising a large number of distinct barcode sequences. In some cases, the number of distinct barcode sequences is in excess of the number of ssDNA fragments from a given locus. In some cases, the distinct barcodes can be used to uniquely identify ssDNA fragments. In some embodiments, the kit further comprises a target-selective oligonucleotide (TSO). In some embodiments, the TSO further comprises a second adaptor sequence located at a first end of the TSO but not a second end of the TSO. In some embodiments, the first end of the TSO is a 5′ end. In some embodiments, the second adaptor sequence comprises a sequence that is at least 70% identical to a second support-bound oligonucleotide coupled to a sequencing platform. In some embodiments, the second adaptor sequence comprises a binding site for a sequencing primer.

The disclosure also provides a kit, comprising: a first adaptor oligonucleotide, wherein the first adaptor comprises a sequence that is at least 70% complementary to a first support-bound oligonucleotide coupled to a sequencing platform; a second adaptor oligonucleotide, wherein the second adaptor comprises a sequence that is distinct from the first adaptor oligonucleotide; an RNA ligase; repair enzymes; and instructions for use. In some embodiments, the second adaptor comprises a sequence that is at least 70% complementary to a sequencing primer. In some embodiments, the second adaptor comprises a sequence that is at least 70% complementary to a second support-bound oligonucleotide coupled to a sequencing platform. In some embodiments, the first adaptor comprises a sequence that is at least 70% complementary to a sequencing primer. In some embodiments, one of the first or second adaptor comprises a barcode sequence. In some embodiments, the first adaptor comprises a 3′ terminal blocking group that prevents the formation of a covalent bond between the 3′ terminal base and another nucleotide. In some embodiments, the 3′ terminal blocking group is dideoxy-dNTP, alkyl, amino-alkyl, fluorophore, digoxigenin, or biotin. In some embodiments, the first adaptor comprises a 5′ polyadenylation sequence. In some embodiments, the RNA ligase is truncated or mutated ligase 2 from T4 or Mth. In some embodiments, the kit further comprises a second RNA ligase. In some embodiments, the second RNA ligase is CircLigase or CircLigase II.

The disclosure provides methods and kits for conducting a high-efficiency ligation reaction. Such methods and kits can be used for a wide range of applications.

The disclosure provides a method of conducting a high-efficiency ligation reaction, comprising ligating a plurality of acceptor nucleic acid molecules to a first end of at least 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, or 90% of a plurality of donor nucleic acid molecules. In some embodiments, the plurality of donor nucleic acid molecules is present in a reaction mixture at a concentration of >10 nM. In some embodiments, the plurality of donor nucleic acid molecules is present in a reaction mixture at a concentration of >1 nM.

In another aspect, the disclosure provides a method of conducting a high-efficiency ligation reaction, comprising ligating a plurality of acceptor nucleic acid molecules to a first end of over 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, or 90% of a plurality of donor nucleic acid molecules, wherein one of the donor or acceptor nucleic acid molecules is >120 nt long.

In another aspect, the disclosure provides a method of conducting a high-efficiency ligation reaction, comprising ligating a plurality of donor nucleic acid molecules to a first end of at least 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, or 90% of a plurality of acceptor nucleic acid molecules. In some embodiments, the plurality of donor nucleic acid molecules is present in a reaction mixture at a concentration of >10 nM. In some embodiments, the plurality of donor nucleic acid molecules is present in a reaction mixture at a concentration of >1 nM.

In another aspect, the disclosure provides a method of conducting a high-efficiency ligation reaction, comprising ligating a plurality of donor nucleic acid molecules to a first end of at least 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, or 90% of a plurality of acceptor nucleic acid molecules, wherein one of the donor or acceptor nucleic acid molecules is >120 nt long.

In some embodiments of the high efficiency ligation methods, theacceptor nucleic acid molecules are the donor nucleic acid molecules. Insome embodiments, the method comprises (a) transferring a nucleosidemonophosphate (NMP) to an amount of a donor nucleic acid molecules in areaction mixture for a time sufficient to effect an accumulation ofNMP-carrying donor nucleic acid molecules; and (b) effecting formationof a covalent bond between an NMP-carrying donor nucleic acid moleculesand an acceptor nucleic acid molecule, wherein steps (a) and (b) arecarried out sequentially in the reaction mixture. In some embodiments,the transferring results in transfer of an NMP to at least 10%, 20%,30%, 40%, 50%, 60%, 70%, 80%, or 90% of the donor nucleic acidmolecules. In some embodiments, a 3′ terminal region of at least onemember of the donor nucleic acid molecules is an unmodified 3′ terminalregion. In some embodiments, the reaction mixture comprises (a) anamount of an nucleoside triphosphate (NTP)-dependent ligase that is atleast equimolar to the amount of donor nucleic acid molecules; and (b)NTP that is present in an amount that is at least 10-fold higher than aMichaelis constant (Km) of the NTP-dependent ligase. In someembodiments, the NTP-dependent ligase is an RNA ligase. In someembodiments the NTP-dependent ligase is an ATP-dependent RNA ligase. Insome embodiments, the RNA ligase is a thermophilic RNA ligase. In someembodiments, the RNA ligase is T4 RNA ligase. In some embodiments, theATP-dependent RNA ligase is MthRnl, CircLigase, or CircLigase II. Insome embodiments the NTP-dependent ligase is a GTP-dependent ligase,e.g., is RTcB. In some embodiments, a 3′ terminal region of a donornucleic acid molecule is modified with a 3′ terminal blocking group. 
Insome embodiments, wherein effecting formation of a covalent bondcomprises adding to the reaction mixture: the acceptor nucleic acidmolecule; and Mn²⁺ In some embodiments, the Mn²⁺ is present in an amountthat is at least 2.5 mM. In some embodiments, the Mn²⁺ is present in anamount that is about 5 mM. In some embodiments, the Mn²⁺ is present inan amount that is about 2.5 mM to about 7.5 mM. In some embodiments, themethod further comprises reducing concentration of the NTP in thereaction mixture. In some embodiments, reducing concentration comprisesreducing concentration of the NTP by at least 1.5 fold, 2-fold, 3-fold,4-fold, 5-fold, 6-fold, 7-fold, 8-fold, 9-fold, or 10-fold. In someembodiments, reducing concentration comprises adding to the reactionmixture an amount of liquid sufficient to dilute the NTP at least 1.5fold, 2-fold, 3-fold, 4-fold, 5-fold, 6-fold, 7-fold, 8-fold, 9-fold, or10-fold. In some embodiments, reducing concentration comprisessedimenting the components of the reaction mixture through high speedcentrifugation prior to adding an amount of liquid sufficient to dilutethe NTP at least 1.5 fold, 2-fold, 3-fold, 4-fold, 5-fold, 6-fold,7-fold, 8-fold, 9-fold, or 10-fold. In some embodiments, the donornucleic acid molecules comprise nucleic acid molecules isolated from abiological source and wherein the acceptor nucleic acid moleculescomprise an adaptor sequence. In some embodiments, the acceptor nucleicacid molecules comprise nucleic acid isolated from a biological subjectand wherein the donor nucleic acid molecules comprise an adaptorsequence. In some embodiments, the acceptor nucleic acid moleculescomprise nucleic acid isolated from a biological subject and wherein thedonor nucleic acid molecules comprise a barcode sequence. In someembodiments, the donor nucleic acid molecules comprise nucleic acidisolated from a biological subject and wherein the acceptor nucleic acidmolecules comprise a barcode sequence. 
In some embodiments, the acceptor nucleic acid molecules or donor nucleic acid molecules comprise a detectable tag. In some embodiments, the NMP is AMP. In some embodiments, the NMP is GMP. In some embodiments, the NTP is ATP. In some embodiments, the NTP is GTP.
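The quantitative conditions recited above (ligase at least equimolar to the donor, NTP at least 10-fold above the Km, and a later fold-dilution of the NTP) can be checked with simple arithmetic. The following is a minimal sketch only; the function names, units, and example values are illustrative assumptions and are not part of the disclosure.

```python
def ligase_is_sufficient(ligase_pmol: float, donor_pmol: float) -> bool:
    """True when the ligase amount is at least equimolar to the donor amount."""
    return ligase_pmol >= donor_pmol


def ntp_is_in_excess(ntp_um: float, km_um: float, min_fold: float = 10.0) -> bool:
    """True when the NTP concentration is at least min_fold times the ligase Km."""
    return ntp_um >= min_fold * km_um


def diluent_volume_ul(reaction_volume_ul: float, fold: float) -> float:
    """Volume of liquid to add so that the NTP is diluted by the given fold.

    Diluting V by a factor f means a final volume of f * V,
    so the added volume is V * (f - 1).
    """
    if fold < 1.0:
        raise ValueError("fold must be >= 1")
    return reaction_volume_ul * (fold - 1.0)
```

For example, under these assumptions a 10 µL reaction needs 90 µL of added liquid for a 10-fold dilution of the NTP.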

In another aspect, the disclosure provides a method of preparing a nucleic acid library, comprising ligating an oligonucleotide sequence to a first end of at least 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, or 90% of a plurality of template nucleic acid molecules to create the nucleic acid library, wherein one of the template nucleic acid molecules is >120 nt long. In some embodiments, the oligonucleotide sequence is an adaptor sequence. In some embodiments, the method further comprises sequencing the nucleic acid library. In some embodiments, the oligonucleotide sequence comprises a detectable label. In some embodiments, the method comprises analyzing the nucleic acid library by array hybridization.

In one aspect, the disclosure provides a method of preparing a nucleic acid library, comprising (a) ligating an adaptor sequence to a first end of at least 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, or 90% of a plurality of template nucleic acid molecules to create the nucleic acid library; and (b) sequencing the nucleic acid library. In some embodiments, sequencing is performed without pre-amplification of the nucleic acid library. In some embodiments, the plurality of template nucleic acid molecules comprises genomic DNA (gDNA). In some embodiments, the gDNA is isolated from a solid tissue sample. In some embodiments, the gDNA is isolated from plasma, serum, sputum, saliva, urine, or sweat. In some embodiments, the plurality of template nucleic acid molecules comprises single-stranded nucleic acid fragments. In some embodiments, the method comprises ligating an adaptor sequence to a first end of at least 50%, 60%, 70%, 80%, 90%, or 95% of the plurality of template nucleic acid molecules.

In some embodiments, the ligating comprises the steps of: (a) transferring an NMP to an amount of a first population of nucleic acids (reactant 1) in a first reaction mixture for a time sufficient to effect an accumulation of NMP-carrying reactant 1; and (b) effecting formation of a covalent bond between the NMP-carrying reactant 1 and a second population of nucleic acids (reactant 2), wherein the reactant 1 is either (i) the plurality of template nucleic acids or (ii) the sequencing adaptor, wherein the reactant 2 is the other of (i) the plurality of template nucleic acids or (ii) the sequencing adaptor, and wherein the adenylated reactant 1 is not purified prior to the effecting formation of a covalent bond. In some embodiments, the transferring results in transfer of NMP to at least 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, or 90% of reactant 1. In some embodiments, a 3′ terminal region of at least one member of the reactant 1 is an unmodified 3′ terminal region. In some embodiments, the first reaction mixture comprises (a) an amount of an NTP-dependent ligase that is at least equimolar to the amount of reactant 1; and (b) NTP that is present in an amount that is at least 10-fold higher than a Michaelis constant (Km) of the NTP-dependent ligase. The NTP-dependent ligase can be any of the foregoing NTP-dependent ligases. In some embodiments, the NTP-dependent ligase is an RNA ligase. In some embodiments, the RNA ligase is a thermophilic RNA ligase. In some embodiments, the NTP-dependent ligase is an ATP-dependent RNA ligase. In some embodiments, the ATP-dependent RNA ligase is MthRnl, T4 RNA ligase, CircLigase, or CircLigase II. In some embodiments, the NTP-dependent ligase is a GTP-dependent ligase. The GTP-dependent ligase can be RtcB. In some embodiments, a 3′ terminal region of at least one member of reactant 1 is modified with a 3′ terminal blocking group.
In some embodiments, effecting formation of a covalent bond comprises adding to the first reaction mixture: a cation; the reactant 2; and a liquid in an amount sufficient to dilute the NTP at least 10-fold. In some embodiments, the cation is Mn²⁺. In some embodiments, the Mn²⁺ is present in an amount that is at least 2.5 mM. In some embodiments, the Mn²⁺ is present in an amount that is about 5 mM. In some embodiments, the Mn²⁺ is present in an amount that is about 2.5 mM to about 7 mM. In some embodiments, the method further comprises ligating a second adaptor sequence to a second end of at least 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, or 90% of the plurality of template nucleic acid molecules. In some embodiments, the method further comprises (a) hybridizing a target-selective oligonucleotide (tso) to a member of the DNA library, wherein the target-selective oligonucleotide comprises (i) a sequence specific for a region of gDNA and (ii) a second adaptor sequence; and (b) extending the hybridized tso to create a double-stranded library member comprising the first and second adaptor. In some embodiments, the tso comprises a sequence having at least 70% identity or complementarity to a region of a cancer-related gene. In some embodiments, the sequencing comprises massively parallel sequencing. In some embodiments, the ligating is performed using a reaction protocol that can be performed in less than 3 hours.

In another aspect, the disclosure provides kits for performing a high efficiency ligation. In some embodiments, the kit comprises an NTP-dependent ligase; a cation; NTP; and instructions for carrying out any of the methods described herein.

The disclosure also provides a method of tracking tumor-specific somatic mutations using tumor genomic DNA (gDNA) isolated from a subject's tumor and normal gDNA isolated from non-tumor tissue from the subject, comprising: (a) sequencing a DNA library prepared from the tumor gDNA without pre-amplification to produce a first dataset; (b) sequencing a DNA library prepared from the normal gDNA without pre-amplification to produce a second dataset; (c) analyzing the first and second dataset to identify one or more tumor-specific somatic mutations in the subject; and (d) detecting the presence or absence of the tumor-specific somatic mutations in cell-free DNA isolated from a liquid sample from the subject. In some embodiments, the liquid sample is selected from the group consisting of plasma, serum, sputum, saliva, urine, cerebrospinal fluid, mucosal secretions, amniotic fluid, bodily fluid and sweat. In some embodiments, the DNA library of step (a) or (b) is prepared using any of the methods described herein.
In some embodiments, the sequencing comprises sequencing at least 200 cancer-related genes. In some embodiments, the cancer-related genes are selected from the group consisting of ABCA1, BRAF, CHD5, EP300, FLT1, ITPA, MYC, PIK3R1, SKP2, TP53, ABCA7, BRCA1, CHEK1, EPHA3, FLT3, JAK1, MYCL1, PIK3R2, SLC19A1, TP73, ABCB1, BRCA2, CHEK2, EPHA5, FLT4, JAK2, MYCN, PKHD1, SLC1A6, TPM3, ABCC2, BRIP1, CLTC, EPHA6, FN1, JAK3, MYH2, PLCB1, SLC22A2, TPMT, ABCC3, BUB1B, COL1A1, EPHA7, FOS, JUN, MYH9, PLCG1, SLCO1B3, TPO, ABCC4, Clorf144, COPS5, EPHA8, FOXO1, KBTBD11, NAV3, PLCG2, SMAD2, TPR, ABCG2, CABLES1, CREB1, EPHB1, FOXO3, KDM6A, NBN, PML, SMAD3, TRIO, ABL1, CACNA2D1, CREBBP, EPHB4, FOXP4, KDR, NCOA2, PMS2, SMAD4, TRRAP, ABL2, CAMKV, CRKL, EPHB6, GAB1, KIT, NEK11, PPARG, SMARCA4, TSC1, ACVR1B, CARD11, CRLF2, EPO, GATA1, KLF6, NF1, PPARGC1A, SMARCB1, TSC2, ACVR2A, CARM1, CSF1R, ERBB2, GLI1, KLHDC4, NF2, PPP1R3A, SMO, TTK, ADCY9, CAV1, CSMD3, ERBB3, GLI3, KRAS, NKX2-1, PPP2R1A, SOCS1, TYK2, AGAP2, CBFA2T3, CSNK1G2, ERBB4, GNA11, LMO2, NOS2, PPP2R1B, SOD2, TYMS, AKT1, CBL, CTNNA1, ERCC1, GNAQ, LRP1B, NOS3, PRKAA2, SOS1, UGT1A1, AKT2, CCND1, CTNNA2, ERCC2, GNAS, LRP2, NOTCH1, PRKCA, SOX10, UMPS, AKT3, CCND2, CTNNB1, ERCC3, GPR124, LRP6, NOTCH2, PRKCZ, SOX2, USP9X, ALK, CCND3, CYFIP1, ERCC4, GPR133, LTK, NOTCH3, PRKDC, SP1, VEGF, ANAPC5, CCNE1, CYLD, ERCC5, GRB2, MAN1B1, NPM1, PTCH1, SPRY2, VEGFA, APC, CD40LG, CYP19A1, ERCC6, GSK3B, MAP2K1, NQO1, PTCH2, SRC, VHL, APC2, CD44, CYP1B1, ERG, GSTP1, MAP2K2, NR3C1, PTEN, ST6GAL2, WRN, AR, CD79A, CYP2C19, ERN2, GUCY1A2, MAP2K4, NRAS, PTGS2, STAT1, WT1, ARAF, CD79B, CYP2C8, ESR1, HDAC1, MAP2K7, NRP2, PTPN11, STAT3, XPA, ARFRP1, CDC42, CYP2D6, ESR2, HDAC2, MAP3K1, NTRK1, PTPRB, STK11, XPC, ARID1A, CDC42BPB, CYP3A4, ETV4, HGF, MAPK1, NTRK2, PTPRD, SUFU, ZFY, ATM, CDC73, CYP3A5, EWSR1, HIF1A, MAPK3, NTRK3, RAD50, SULT1A1, ZNF521, ATP5A1, CDH1, DACH2, EXT1, HM13, MAPK8, OMA1, RAD51, SUZ12, ATR, CDH10, DCC, EZH2, HMGA1, MARK3, OR10R2, RAF1, TAF1, AURKA, CDH2, DCLK3, FANCA, HNF1A, 
MCL1, PAK3, RARA, TBX22,AURKB, CDH20, DDB2, FANCD2, HOXA3, MDM2, PARP1, RB1, TCF12, BAI3, CDH5,DDR2, FANCE, HOXA9, MDM4, PAX5, REM1, TCF3, BAP1, CDK2, DGKB, FANCF,HRAS, MECOM, PCDH15, RET, TCF4, BARD1, CDK4, DGKZ, FAS, HSP90AA1, MEN1,PCDH18, RICTOR, TEK, BAX, CDK6, DIRAS3, FBXW7, IDH1, MET, PCNA, RIPK1,TEP1, BCL11A, CDK7, DLG3, FCGR3A, IDH2, MITF, PDGFA, ROR1, TERT, BCL2,CDK8, DLL1, FES, IFNG, MLH1, PDGFB, ROR2, TET2, BCL2A1, CDKN1A, DNMT1,FGFR1, IGF1R, MLL, PDGFRA, ROS1, TGFBR2, BCL2L1, CDKN1B, DNMT3A, FGFR2,IGF2R, MLL3, PDGFRB, RPS6KA2, THBS1, BCL2L2, CDKN2A, DNMT3B, FGFR3,IKBKE, MPL, PDZRN3, RPTOR, TNFAIP3, BCL3, CDKN2B, DOT1L, FGFR4, IKZF1,MRE11A, PHLPP2, RSPO2, TNKS, BCL6, CDKN2C, DPYD, FH, IL2RG, MSH2,PIK3C3, RSPO3, TNKS2, BCR, CDKN2D, E2F1, FHOD3, INHBA, MSH6, PIK3CA,RUNX1, TNNI3K, BIRC5, CDX2, EED, FIGF, INSR, MTHFR, PIK3CB, SDHB, TNR,BIRC6, CEBPA, EGF, FLG2, IRS1, MTOR, PIK3CD, SF3B1, TOP1, BLM, CERK,EGFR, FLNC, IRS2, MUTYH, PIK3CG, SHC1, and TOP2A.

In some embodiments, the method further comprises generating a report communicating a profile of the tumor-specific mutations. In some embodiments, detecting the presence or absence of the tumor-specific mutations in cell-free DNA isolated from a liquid sample from the subject is performed at a plurality of time points. In some embodiments, one time point is prior to a first administration of a cancer therapy and a second time point is subsequent to the first administration. In some embodiments, the method further comprises generating a report communicating the profile of tumor-specific mutations at the plurality of time points. In some embodiments, the report comprises a list of one or more therapeutic candidates targeting a gene that harbors one of the tumor-specific mutations. In some embodiments, the report is generated 1 week from isolating the gDNA. In some embodiments, the mutations comprise copy number variation. In some embodiments, the detecting comprises sequencing the cell-free DNA. In some embodiments, the method comprises sequencing at least 10 cancer-related genes present in the cell-free DNA, wherein one of the at least 10 cancer-related genes is identified as harboring a tumor-specific mutation. In some embodiments, the method comprises sequencing at least 100 cancer-related genes present in the cell-free DNA, wherein one of the at least 100 cancer-related genes is identified as harboring a tumor-specific mutation. In some embodiments, sequencing comprises sequencing by any of the methods described herein.
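The comparison of the first (tumor) and second (normal) datasets, followed by detection in cell-free DNA at a plurality of time points, can be illustrated as a set operation. This is a simplified sketch under stated assumptions: the `Variant` record, the function names, and the idea that each dataset has already been reduced to a list of variant calls are hypothetical and are not specified by the disclosure.

```python
from typing import Dict, Iterable, NamedTuple, Set


class Variant(NamedTuple):
    """Hypothetical minimal variant record (not defined in the disclosure)."""
    chrom: str
    pos: int
    ref: str
    alt: str


def tumor_specific(tumor_calls: Iterable[Variant],
                   normal_calls: Iterable[Variant]) -> Set[Variant]:
    """Variants seen in the tumor dataset but absent from the normal dataset."""
    return set(tumor_calls) - set(normal_calls)


def track_in_cfdna(somatic: Set[Variant],
                   cfdna_by_timepoint: Dict[str, Iterable[Variant]]
                   ) -> Dict[str, Dict[Variant, bool]]:
    """For each time point, record presence or absence of each somatic variant."""
    result = {}
    for timepoint, calls in cfdna_by_timepoint.items():
        observed = set(calls)
        result[timepoint] = {v: (v in observed) for v in somatic}
    return result
```

A per-time-point table of this kind is one possible input to the report described above; real variant calling would also need to account for sequencing error and coverage, which this sketch ignores.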

In some aspects, the disclosure provides an oligonucleotide probe with a low melting temperature (Tm), e.g., a low Tm probe, comprising: a detectable moiety; a quencher moiety; and a melting temperature (Tm) below 50° C. In some embodiments, the low Tm probe has a length of 8-30 nucleotides. In some embodiments, the detectable moiety is quenched at a temperature of 55° C. or higher. In some embodiments, the detectable moiety is quenched if the temperature is sufficiently low that the probe occupies a conformational state such that the distance between the quencher and detectable moiety is less than the Förster radius, but at high temperature is no longer efficiently quenched because of the increase in configurational entropy as the average distance between the detectable moiety and quencher exceeds the Förster radius. In some embodiments, the low Tm probe does not hybridize to a complementary template nucleic acid at an ambient temperature above 55° C. In some embodiments, the quencher moiety quenches the detectable moiety if the probe is not hybridized to a template strand. In some embodiments, the Tm of the low Tm probe is between 30-45° C. In some embodiments, the fluorophore moiety and quencher moiety of the low Tm probe are spaced at least seven nucleotides apart. In some embodiments, the low Tm probe comprises a nucleotide with a Tm enhancing base. In some embodiments, the nucleotide with a Tm enhancing base is a Superbase, locked nucleotide, or bridged nucleotide. In some embodiments, the detectable moiety of the low Tm probe comprises a fluorophore.

In some embodiments, the low Tm probe has a length of at least 15 nucleotides. In some embodiments, the low Tm probe has a GC content of at least 40%. In some embodiments, the low Tm probe has a GC content that is less than 80%. In some embodiments, the low Tm probe has a GC content that is less than 50%. In some embodiments, the low Tm probe has a GC content that is less than 40%.

In some embodiments, the low Tm probe has a length of less than 15 nucleotides. In some embodiments, the low Tm probe has a GC content of less than 40%. In some embodiments, the low Tm probe has a GC content that is at least 40%. In some embodiments, the low Tm probe has a GC content that is between 40-80%. In some embodiments, the low Tm probe has a GC content of less than 40%, and further comprises a superbase, a locked or bridged nucleotide.
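The length and GC-content ranges above can be screened computationally for a candidate probe sequence. The sketch below is illustrative only. It uses the Wallace rule (2° C. per A/T and 4° C. per G/C), a common rule of thumb for short unmodified oligonucleotides that is swapped in here as an assumption; the disclosure does not state how Tm is calculated, and the rule does not model Tm enhancing bases, fluorophores, or quenchers. All function names are hypothetical.

```python
def gc_content(seq: str) -> float:
    """Fraction of G and C bases in the sequence (0.0 to 1.0)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq: str) -> float:
    """Rough Tm estimate in degrees C: 2 per A/T plus 4 per G/C (Wallace rule)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2.0 * at + 4.0 * gc


def is_candidate_low_tm_probe(seq: str, max_tm: float = 50.0) -> bool:
    """Screen on length (8-30 nt) and an estimated Tm below max_tm."""
    return 8 <= len(seq) <= 30 and wallace_tm(seq) < max_tm
```

A sequence that passes such a screen would still need its Tm confirmed empirically or with a nearest-neighbor model under the actual reaction conditions.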

In some embodiments, the low Tm probe comprises a sequence having atleast 70% complementarity or identity to a nucleotide sequence of atleast 10 contiguous nucleotides contained in a gene selected from thegroup consisting of ABCA1, BRAF, CHD5, EP300, FLT1, ITPA, MYC, PIK3R1,SKP2, TP53, ABCA7, BRCA1, CHEK1, EPHA3, FLT3, JAK1, MYCL1, PIK3R2,SLC19A1, TP73, ABCB1, BRCA2, CHEK2, EPHA5, FLT4, JAK2, MYCN, PKHD1,SLC1A6, TPM3, ABCC2, BRIP1, CLTC, EPHA6, FN1, JAK3, MYH2, PLCB1,SLC22A2, TPMT, ABCC3, BUB1B, COL1A1, EPHA7, FOS, JUN, MYH9, PLCG1,SLCO1B3, TPO, ABCC4, Clorf144, COPS5, EPHA8, FOXO1, KBTBD11, NAV3,PLCG2, SMAD2, TPR, ABCG2, CABLES1, CREB1, EPHB1, FOXO3, KDM6A, NBN, PML,SMAD3, TRIO, ABL1, CACNA2D1, CREBBP, EPHB4, FOXP4, KDR, NCOA2, PMS2,SMAD4, TRRAP, ABL2, CAMKV, CRKL, EPHB6, GAB1, KIT, NEK11, PPARG,SMARCA4, TSC1, ACVR1B, CARD11, CRLF2, EPO, GATA1, KLF6, NF1, PPARGC1A,SMARCB1, TSC2, ACVR2A, CARM1, CSF1R, ERBB2, GLI1, KLHDC4, NF2, PPP1R3A,SMO, TTK, ADCY9, CAV1, CSMD3, ERBB3, GLI3, KRAS, NKX2-1, PPP2R1A, SOCS1,TYK2, AGAP2, CBFA2T3, CSNK1G2, ERBB4, GNA11, LMO2, NOS2, PPP2R1B, SOD2,TYMS, AKT1, CBL, CTNNA1, ERCC1, GNAQ, LRP1B, NOS3, PRKAA2, SOS1, UGT1A1,AKT2, CCND1, CTNNA2, ERCC2, GNAS, LRP2, NOTCH1, PRKCA, SOX10, UMPS,AKT3, CCND2, CTNNB1, ERCC3, GPR124, LRP6, NOTCH2, PRKCZ, SOX2, USP9X,ALK, CCND3, CYFIP1, ERCC4, GPR133, LTK, NOTCH3, PRKDC, SP1, VEGF,ANAPC5, CCNE1, CYLD, ERCC5, GRB2, MAN1B1, NPM1, PTCH1, SPRY2, VEGFA,APC, CD40LG, CYP19A1, ERCC6, GSK3B, MAP2K1, NQO1, PTCH2, SRC, VHL,APC2,CD44, CYP1B1, ERG, GSTP1, MAP2K2, NR3C1, PTEN, ST6GAL2, WRN, AR, CD79A,CYP2C19, ERN2, GUCY1A2, MAP2K4, NRAS, PTGS2, STAT1, WT1, ARAF, CD79B,CYP2C8, ESR1, HDAC1, MAP2K7, NRP2, PTPN11, STAT3, XPA, ARFRP1, CDC42,CYP2D6, ESR2, HDAC2, MAP3K1, NTRK1, PTPRB, STK11, XPC, ARID1A, CDC42BPB,CYP3A4, ETV4, HGF, MAPK1, NTRK2, PTPRD, SUFU, ZFY, ATM, CDC73, CYP3A5,EWSR1, HIF1A, MAPK3, NTRK3, RAD50, SULT1A1, ZNF521,ATP5A1, CDH1, DACH2,EXT1, HM13, MAPK8, OMA1, RAD51, SUZ12, ATR, CDH10, DCC, EZH2, 
HMGA1,MARK3, OR10R2, RAF1, TAF1, AURKA, CDH2, DCLK3, FANCA, HNF1A, MCL1, PAK3,RARA, TBX22, AURKB, CDH20, DDB2, FANCD2, HOXA3, MDM2, PARP1, RB1, TCF12,BAI3, CDH5, DDR2, FANCE, HOXA9, MDM4, PAX5, REM1, TCF3, BAP1, CDK2,DGKB, FANCF, HRAS, MECOM, PCDH15, RET, TCF4, BARD1, CDK4, DGKZ, FAS,HSP90AA1, MEN1, PCDH18, RICTOR, TEK, BAX, CDK6, DIRAS3, FBXW7, IDH1,MET, PCNA, RIPK1, TEP1, BCL11A, CDK7, DLG3, FCGR3A, IDH2, MITF, PDGFA,ROR1, TERT, BCL2, CDK8, DLL1, FES, IFNG, MLH1, PDGFB, ROR2, TET2,BCL2A1, CDKN1A, DNMT1, FGFR1, IGF1R, MLL, PDGFRA, ROS1, TGFBR2, BCL2L1,CDKN1B, DNMT3A, FGFR2, IGF2R, MLL3, PDGFRB, RPS6KA2, THBS1, BCL2L2,CDKN2A, DNMT3B, FGFR3, IKBKE, MPL, PDZRN3, RPTOR, TNFAIP3, BCL3, CDKN2B,DOT1L, FGFR4, IKZF1, MRE11A, PHLPP2, RSPO2, TNKS, BCL6, CDKN2C, DPYD,FH, IL2RG, MSH2, PIK3C3, RSPO3, TNKS2, BCR, CDKN2D, E2F1, FHOD3, INHBA,MSH6, PIK3CA, RUNX1, TNNI3K, BIRC5, CDX2, EED, FIGF, INSR, MTHFR, HADH,RPP30, ZFP3, PIK3CB, SDHB, TNR, BIRC6, CEBPA, EGF, FLG2, IRS1, MTOR,PIK3CD, SF3B1, TOP1, BLM, CERK, EGFR, FLNC, IRS2, MUTYH, PIK3CG, SHC1,and TOP2A.

In some aspects, the disclosure also provides a reaction mixture comprising at least one primer/probe set, wherein the primer/probe set comprises: a forward primer designed to hybridize to a genomic region at a first location; and a low Tm probe as described herein. In some embodiments, the reaction mixture further comprises a reverse primer designed to hybridize to the genomic region at a second location. In some embodiments, the low Tm probe has a Tm that is at least 15° C. lower than the Tm of the forward primer. In some embodiments, the low Tm probe has a Tm that is at least 15° C. lower than an average of the Tm of the forward primer and the Tm of the reverse primer. In some embodiments, the low Tm probe is designed to hybridize to the genomic region at a third location located between the first and second location. In some embodiments, the reverse primer is present in an amount that is at least 2 to 10-fold less than an amount of the forward primer. In some embodiments, the reverse primer is present in an amount that is no more than 2-fold different than an amount of the forward primer.
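The Tm relationship between the probe and the primers reduces to a simple comparison once the Tm values are known. The following is a minimal sketch; the function name, argument names, and the example temperatures are illustrative assumptions rather than values from the disclosure.

```python
from typing import Optional


def probe_tm_gap_ok(probe_tm: float,
                    forward_tm: float,
                    reverse_tm: Optional[float] = None,
                    min_gap: float = 15.0) -> bool:
    """Check that the probe Tm is at least min_gap degrees C below the primers.

    With only a forward primer, the probe is compared to the forward primer Tm.
    With both primers, it is compared to the average of the two primer Tm values.
    """
    if reverse_tm is None:
        reference = forward_tm
    else:
        reference = (forward_tm + reverse_tm) / 2.0
    return reference - probe_tm >= min_gap
```

For instance, a probe with a Tm of 42° C. paired with primers at 60° C. and 56° C. is 16° C. below the 58° C. average and would satisfy the 15° C. condition.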

In some embodiments, the reaction mixture further comprises a nucleic acid sample isolated from a biological sample. In some embodiments, the biological sample is a sample isolated from a subject. In some embodiments, the subject is a human subject. In some embodiments, the human subject is diagnosed with, suspected of having, or suspected of being at increased risk for a disease. In some embodiments, the disease is cancer. In some embodiments, the template nucleic acid comprises a genomic region. In some embodiments, the template nucleic acid comprises DNA, RNA, or cDNA. In some embodiments, the reaction mixture further comprises a polymerase. In some embodiments, the polymerase is a DNA polymerase. In some embodiments, the reaction mixture comprises (a) a first template nucleic acid; (b) an amount of a forward primer; (c) an amount of a reverse primer, wherein the amount of reverse primer is at least 2 to 10-fold less than the amount of the forward primer; and (d) a low Tm probe.

In some embodiments, the reaction mixture comprises a plurality of primer/probe sets. In some embodiments, each primer/probe set of the plurality is specific for a different region of genomic DNA. In some embodiments, the genomic region is associated with a disease-related mutation. In some embodiments, the mutation comprises a copy number variation. In some embodiments, the mutation comprises a single nucleotide polymorphism (SNP), insertion, deletion, or inversion. In some embodiments, one of the forward or reverse primers overlays the SNP, insertion, deletion, or inversion. In some embodiments, the low Tm probe overlays the SNP, insertion, deletion, or inversion. In some embodiments, the disease is a cancer. In some embodiments, one or both primers comprise a probe binding site, and the low Tm probe binds to the probe binding site on either the forward or reverse primer, or both.

In some embodiments, the primer/probe set comprises a plurality of low Tm probes, wherein each low Tm probe is an allele-specific probe designed to bind with greater avidity to a sequence comprising one specific allele of the genomic region as compared to a sequence comprising any other allele of the genomic region, wherein each allele-specific probe is specific for a different allele.

In some embodiments, each of the allele-specific probes comprises a spectrally distinct fluorophore.

In some embodiments, the difference in binding energy of an allele-specific probe to the one specific allele as compared to a binding energy of the allele-specific probe to any other allele is more than 1% of the overall binding energy of the low Tm probe to the genomic region. In some embodiments, the low Tm probe is a beacon probe. In some embodiments, the low Tm probe is a Pleiades probe.

In a related aspect, the disclosure provides a method, the method comprising partitioning a reaction mixture comprising a low Tm probe as described herein into a plurality of reaction volumes; and performing, in at least one of the reaction volumes, a PCR amplification reaction comprising multiple rounds of thermal cycling, wherein the low Tm probe does not affect efficiency of the PCR amplification reaction.

In some embodiments, the low Tm probe does not hybridize to a template nucleic acid or PCR reaction product during an annealing phase or extension phase of the PCR amplification reaction. In some embodiments, the method further comprises cooling at least one of the reaction volumes to below 50° C., wherein the cooling enables hybridization of the low Tm probe to a template nucleic acid or PCR reaction product. In some embodiments, the template nucleic acid or PCR reaction product comprises a sequence having at least 70% complementarity to the low Tm probe.

In some embodiments, the method comprises cooling at least one of the reaction volumes to below 37° C., wherein the cooling enables hybridization of at least 5%, 10%, 20%, 30%, 40%, 50%, 60%, or 70% of an amount of low Tm probes to nucleic acids comprising a sequence having at least 70% complementarity to the low Tm probe. In some embodiments, the partitioning results in each reaction volume containing on average <1, 1, or more than 1 molecule of template nucleic acid. In some embodiments, the partitioning results in each reaction volume containing on average 1 or more molecules of template nucleic acid.
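The relationship between the average number of template molecules per reaction volume and the fraction of volumes that contain at least one molecule is commonly modeled with Poisson statistics. That model is an assumption introduced here for illustration and is not recited in the disclosure; the function names are likewise hypothetical.

```python
import math


def expected_positive_fraction(mean_copies: float) -> float:
    """Fraction of volumes expected to hold at least one template molecule.

    Under a Poisson model, the chance that a volume is empty is exp(-mean).
    """
    if mean_copies < 0:
        raise ValueError("mean_copies must be non-negative")
    return 1.0 - math.exp(-mean_copies)


def mean_copies_per_volume(positive: int, total: int) -> float:
    """Invert the Poisson model: mean = -ln(1 - positive / total)."""
    if total <= 0 or not 0 <= positive < total:
        raise ValueError("need 0 <= positive < total")
    return -math.log(1.0 - positive / total)
```

Under this model, if half of the reaction volumes are positive, the estimated loading is ln 2, or roughly 0.69 molecules per volume on average.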

In some embodiments, the method comprises performing an exponential PCR amplification reaction and a linear PCR amplification reaction in at least one of the reaction volumes.

In some embodiments, the exponential PCR amplification and the linear PCR amplification reaction occur sequentially without adding or removing components from the reaction volumes.

In some embodiments, the PCR amplification reaction results in at least 1%, 5%, 10%, 20%, 30%, 40%, or 50% of the amplification products being single-stranded amplification products.

In some embodiments, the reaction volumes are droplets. In some embodiments, the hybridization results in emission of fluorescence from the low Tm probe. In some embodiments, the method further comprises detecting the presence or absence of the fluorescence in at least one of the reaction volumes. In some embodiments, the method comprises measuring intensity of the fluorescence in the reaction volumes. In some embodiments, the method further comprises determining a number and/or fraction of fluorescence-positive reaction volumes. In some embodiments, the method comprises determining the presence, absence, or amount of one or more mutations in the sample based on the number and/or fraction of fluorescence-positive reaction volumes. In some embodiments, the one or more mutations comprise a SNP, deletion, insertion, or inversion. In some embodiments, the one or more mutations comprise a copy number variation of a gene. In some embodiments, the one or more mutations comprise a disease-related mutation. In some embodiments, the disease is cancer. 
In some embodiments, the one or more mutations comprises amutation of one or more genes selected from the group consisting ofABCA1, BRAF, CHD5, EP300, FLT1, ITPA, MYC, PIK3R1, SKP2, TP53, ABCA7,BRCA1, CHEK1, EPHA3, FLT3, JAK1, MYCL1, PIK3R2, SLC19A1, TP73, ABCB1,BRCA2, CHEK2, EPHA5, FLT4, JAK2, MYCN, PKHD1, SLC1A6, TPM3, ABCC2,BRIP1, CLTC, EPHA6, FN1, JAK3, MYH2, PLCB1, SLC22A2, TPMT, ABCC3, BUB1B,COL1A1, EPHA7, FOS, JUN, MYH9, PLCG1, SLCO1B3, TPO, ABCC4, Clorf144,COPS5, EPHA8, FOXO1, KBTBD11, NAV3, PLCG2, SMAD2, TPR, ABCG2, CABLES1,CREB1, EPHB1, FOXO3, KDM6A, NBN, PML, SMAD3, TRIO, ABL1, CACNA2D1,CREBBP, EPHB4, FOXP4, KDR, NCOA2, PMS2, SMAD4, TRRAP, ABL2, CAMKV, CRKL,EPHB6, GAB1, KIT, NEK11, PPARG, SMARCA4, TSC1, ACVR1B, CARD11, CRLF2,EPO, GATA1, KLF6, NF1, PPARGC1A, SMARCB1, TSC2, ACVR2A, CARM1, CSF1R,ERBB2, GLI1, KLHDC4, NF2, PPP1R3A, SMO, TTK, ADCY9, CAV1, CSMD3, ERBB3,GLI3, KRAS, NKX2-1, PPP2R1A, SOCS1, TYK2, AGAP2, CBFA2T3, CSNK1G2,ERBB4, GNA11, LMO2, NOS2, PPP2R1B, SOD2, TYMS, AKT1, CBL, CTNNA1, ERCC1,GNAQ, LRP1B, NOS3, PRKAA2, SOS1, UGT1A1, AKT2, CCND1, CTNNA2, ERCC2,GNAS, LRP2, NOTCH1, PRKCA, SOX10, UMPS, AKT3, CCND2, CTNNB1, ERCC3,GPR124, LRP6, NOTCH2, PRKCZ, SOX2, USP9X, ALK, CCND3, CYFIP1, ERCC4,GPR133, LTK, NOTCH3, PRKDC, SP1, VEGF, ANAPC5, CCNE1, CYLD, ERCC5, GRB2,MAN1B1, NPM1, PTCH1, SPRY2, VEGFA, APC, CD40LG, CYP19A1, ERCC6, GSK3B,MAP2K1, NQO1, PTCH2, SRC, VHL,APC2, CD44, CYP1B1, ERG, GSTP1, MAP2K2,NR3C1, PTEN, ST6GAL2, WRN, AR, CD79A, CYP2C19, ERN2, GUCY1A2, MAP2K4,NRAS, PTGS2, STAT1, WT1, ARAF, CD79B, CYP2C8, ESR1, HDAC1, MAP2K7, NRP2,PTPN11, STAT3, XPA, ARFRP1, CDC42, CYP2D6, ESR2, HDAC2, MAP3K1, NTRK1,PTPRB, STK11, XPC, ARID1A, CDC42BPB, CYP3A4, ETV4, HGF, MAPK1, NTRK2,PTPRD, SUFU, ZFY, ATM, CDC73, CYP3A5, EWSR1, HIF1A, MAPK3, NTRK3, RAD50,SULT1A1, ZNF521,ATP5A1, CDH1, DACH2, EXT1, HM13, MAPK8, OMA1, RAD51,SUZ12, ATR, CDH10, DCC, EZH2, HMGA1, MARK3, OR10R2, RAF1, TAF1, AURKA,CDH2, DCLK3, FANCA, HNF1A, MCL1, PAK3, RARA, TBX22, AURKB, CDH20, 
DDB2,FANCD2, HOXA3, MDM2, PARP1, RB1, TCF12, BAI3, CDH5, DDR2, FANCE, HOXA9,MDM4, PAX5, REM1, TCF3, BAP1, CDK2, DGKB, FANCF, HRAS, MECOM, PCDH15,RET, TCF4, BARD1, CDK4, DGKZ, FAS, HSP90AA1, MEN1, PCDH18, RICTOR, TEK,BAX, CDK6, DIRAS3, FBXW7, IDH1, MET, PCNA, RIPK1, TEP1, BCL11A, CDK7,DLG3, FCGR3A, IDH2, MITF, PDGFA, ROR1, TERT, BCL2, CDK8, DLL1, FES,IFNG, MLH1, PDGFB, ROR2, TET2, BCL2A1, CDKN1A, DNMT1, FGFR1, IGF1R, MLL,PDGFRA, ROS1, TGFBR2, BCL2L1, CDKN1B, DNMT3A, FGFR2, IGF2R, MLL3,PDGFRB, RPS6KA2, THBS1, BCL2L2, CDKN2A, DNMT3B, FGFR3, IKBKE, MPL,PDZRN3, RPTOR, TNFAIP3, BCL3, CDKN2B, DOT1L, FGFR4, IKZF1, MRE11A,PHLPP2, RSPO2, TNKS, BCL6, CDKN2C, DPYD, FH, IL2RG, MSH2, PIK3C3, RSPO3,TNKS2, BCR, CDKN2D, E2F1, FHOD3, INHBA, MSH6, PIK3CA, RUNX1, TNNI3K,BIRC5, CDX2, EED, FIGF, INSR, MTHFR, HADH, RPP30, ZFP3, PIK3CB, SDHB,TNR, BIRC6, CEBPA, EGF, FLG2, IRS1, MTOR, PIK3CD, SF3B1, TOP1, BLM,CERK, EGFR, FLNC, IRS2, MUTYH, PIK3CG, SHC1, and TOP2A.

In some embodiments, the one or more mutations comprises a mutation of one or more genes selected from the group consisting of DDR2, EGFR, AURKA, VEGFA, FGFR1, CDK4, ERBB2, CDK6, JAK2, MET, BRAF, ERBB3, and SRC.

In some embodiments, the method comprises generating a report communicating a profile of the presence, absence, and/or level of the mutation in the sample. In some embodiments, the report further comprises a description of a therapeutic agent targeting the mutation.

In a related aspect, the disclosure provides a computer system, comprising: a memory unit configured to receive data from a sample, wherein the data is generated by any of the foregoing methods employing a low Tm probe; computer executable instructions for analysis of the data; and computer executable instructions to determine the presence, absence, or amount of a mutation or template in the sample based on the analysis. In some embodiments, the computer system further comprises computer executable instructions to generate a report of the presence, absence, or amount of a mutation in the sample. In some embodiments, the computer system further comprises computer executable instructions to generate a report of therapeutic options based on the presence, absence, or amount of a mutation in the sample. In some embodiments, the computer system further comprises a user interface configured to communicate or display the report to a user.
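One way such computer executable instructions might analyze fluorescence data is to threshold the measured intensity of each reaction volume and count the fluorescence-positive volumes. The sketch below is hypothetical: the threshold, the `min_positive` cutoff, the dictionary layout, and the function name are assumptions for illustration and do not describe the actual analysis of the disclosure.

```python
from typing import Dict, Sequence


def summarize_reaction_volumes(intensities: Sequence[float],
                               threshold: float,
                               min_positive: int = 1) -> Dict[str, object]:
    """Count fluorescence-positive reaction volumes and make a simple call.

    A volume is counted as positive when its intensity meets the threshold.
    The target is called 'detected' when at least min_positive volumes are positive.
    """
    total = len(intensities)
    positive = sum(1 for value in intensities if value >= threshold)
    fraction = positive / total if total else 0.0
    return {
        "total": total,
        "positive": positive,
        "fraction": fraction,
        "detected": positive >= min_positive,
    }
```

The resulting summary could feed a report of presence, absence, or amount; a practical implementation would need validated thresholds and controls for false-positive volumes.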

In yet another related aspect, the disclosure provides a kit, comprising: at least one primer/probe set, wherein the primer/probe set comprises (i) a forward primer designed to hybridize to a genomic region at a first location, (ii) a reverse primer designed to hybridize to the genomic region at a second location, and (iii) a low Tm probe described herein, wherein the low Tm probe is designed to hybridize to the genomic region at a third location.

The disclosure also provides a method of treating cancer in a subject in need thereof, comprising: (a) obtaining a biological sample from the subject; (b) from a nucleic acid sample isolated from the biological sample, determining a presence or absence of a copy number variation (CNV) in at least five genes selected from the group consisting of MET, FGFR1, FGFR2, FLT3, HER3, EGFR, mTOR, CDK4, HER2, RET, HADH, ZFP3, DDR2, AURKA, VEGFA, CDK6, JAK2, BRAF, and SRC; (c) based on the determining, generating a subject-specific CNV profile; and (d) based on the subject-specific CNV profile, selecting a cancer therapy for the subject. In some embodiments, the determining a presence or absence of a CNV comprises use of any of the foregoing methods. In some embodiments, the determining comprises a digital PCR assay. In some embodiments, the digital PCR assay comprises use of any of the foregoing oligonucleotide probes. In some embodiments, the oligonucleotide probe comprises a nucleotide sequence of any of SEQ ID NOS: 61, 64, 67, 70, 73, 76, 79, 82, 85, 88, 91, 94, 97, 100, 103, 106, 109, 112, 115, or 118. In some embodiments, the digital PCR assay comprises use of any of the foregoing primers. In some embodiments, the primer comprises a nucleotide sequence of any of SEQ ID NOS: 59, 60, 62, 63, 65, 66, 68, 69, 71, 72, 74, 75, 77, 78, 80, 81, 83, 84, 86, 87, 89, 90, 92, 93, 95, 96, 98, 99, 101, 102, 104, 105, 107, 108, 110, 111, 113, 114, 116, or 117. In some embodiments, the method comprises determining the presence or absence of a CNV in at least 10, 12, or 18 genes. In some embodiments, the biological sample is suspected of harboring nucleic acids originating from the cancer. In some embodiments, the biological sample is a solid tissue sample. In some embodiments, the solid tissue sample is a formalin fixed, paraffin embedded sample. In some embodiments, the biological sample is a liquid biological sample. In some embodiments, the liquid biological sample is selected from the group consisting of blood, serum, plasma, urine, sweat, tears, saliva, mucosal secretions and sputum.

The disclosure also provides a computer system, comprising: (a) a memory unit configured to receive data from a sample, wherein the data is generated by any of the foregoing methods; (b) computer executable instructions for analysis of the data; and (c) computer executable instructions to determine the presence, absence, or amount of a mutation in the sample based on the analysis. In some embodiments, the computer system further comprises computer executable instructions to generate a report of the presence, absence, or amount of a mutation in the sample. In some embodiments, the computer system further comprises computer executable instructions to generate a report of therapeutic options based on the presence, absence, or amount of a mutation in the sample. In some embodiments, the computer system further comprises a user interface configured to communicate or display the report to a user.

The disclosure also provides a kit, comprising: (a) at least one primer/probe set, wherein the primer/probe set comprises (i) a forward primer designed to hybridize to a genomic region at a first location, (ii) a reverse primer designed to hybridize to the genomic region at a second location, and (iii) an oligonucleotide probe as previously set forth, wherein the oligonucleotide probe is designed to hybridize to the genomic region at a third location located between the first and second location; and (b) instructions for use.

The disclosure also provides an oligonucleotide probe as set forth in any of SEQ ID NOS: 4-21, 23, 24, 61, 64, 67, 70, 73, 76, 79, 82, 85, 88, 91, 94, 97, 100, 103, 106, 109, 112, 115, or 118.

The disclosure also provides a target-selective oligonucleotide as set forth in any of SEQ ID NOS: 1948-5593.

The disclosure also provides an oligonucleotide primer having a sequence as set forth in SEQ ID NO: 25 or 26.

The disclosure also provides an oligonucleotide primer having a sequence as set forth in any of SEQ ID NOS: 1-3, 22, 27-58, 59, 60, 62, 63, 65, 66, 68, 69, 71, 72, 74, 75, 77, 78, 80, 81, 83, 84, 86, 87, 89, 90, 92, 93, 95, 96, 98, 99, 101, 102, 104, 105, 107, 108, 110, 111, 113, 114, 116, or 117.

INCORPORATION BY REFERENCE

All publications, patents, and patent applications mentioned in this specification are herein incorporated by reference to the same extent as if each individual publication, patent, or patent application was specifically and individually indicated to be incorporated by reference.

BRIEF DESCRIPTION OF THE DRAWINGS

The novel features of the disclosure are set forth with particularity in the appended claims. A better understanding of the features and advantages of the present disclosure will be obtained by reference to the following detailed description that sets forth illustrative embodiments, in which the principles of the disclosure are utilized, and the accompanying drawings of which:

FIG. 1 depicts an exemplary workflow of a method for assessing cancer in a subject.

FIG. 2 depicts an exemplary workflow of a method for sequencing a tumor cell and a normal cell in a subject. FIG. 2 discloses SEQ ID NOS 119-120, respectively, in order of appearance.

FIG. 3 depicts an exemplary workflow for a method of preparing a DNA library from a tumor sample of a subject.

FIG. 4 depicts an exemplary embodiment of a method of preparing a DNA library from a tumor sample of a subject.

FIG. 5 depicts an exemplary embodiment of a method of assessing tumor-specific mutations in cell-free DNA from a blood sample of a subject.

FIG. 6 depicts an exemplary workflow for allele detection in a sample.

FIG. 7 depicts an exemplary workflow for wild-type and mutant allele detection in a sample.

FIG. 8 depicts an exemplary embodiment of a subject-specific report of tumor-specific mutations in a subject.

FIG. 9 depicts an exemplary computer system of the disclosure.

FIG. 10A depicts an exemplary workflow of a ligation method of the disclosure.

FIG. 10B depicts an exemplary method for preparing a single-stranded DNA library.

FIG. 11 depicts an exemplary embodiment of a ligation method of the disclosure.

FIG. 12 depicts an exemplary workflow of a method of preparing a nucleic acid library for sequencing.

FIGS. 13A and 13B depict exemplary embodiments of a method of preparing a single-adaptor nucleic acid library for sequencing.

FIGS. 14A and 14B depict exemplary embodiments of a method of ligating a second adaptor sequence to a single-adaptor ligated library member.

FIG. 15 depicts an exemplary method of cloning an insert into a plasmid vector using a high efficiency ligation method.

FIG. 16 depicts an exemplary workflow of a method for sensitive detection of amplicons.

FIG. 17 depicts an exemplary embodiment of a method for sensitive detection of amplicons.

FIG. 18 depicts an exemplary embodiment of a real-time detection method for sensitive detection of amplicons.

FIG. 19 depicts an exemplary embodiment of an exponential PCR-based detection method for sensitive detection of amplicons.

FIG. 20 depicts an exemplary embodiment of a linear PCR-based detection method for sensitive detection of amplicons.

FIG. 21 depicts an exemplary embodiment of a PCR-based detection method that utilizes exponential amplification followed by linear amplification.

FIGS. 22A-22B depict an exemplary embodiment of an allele discrimination assay.

FIG. 23 depicts another exemplary embodiment of an allele discrimination assay.

FIG. 24 depicts a method used to assess a cancer in a subject with colon cancer.

FIG. 25 and FIGS. 26A-26D depict results from a validation assay for a tumor-specific mutation in the subject with colon cancer.

FIG. 27 depicts an exemplary embodiment of a method for quantitating the efficiency of a ligation method described herein.

FIG. 28 depicts ddPCR results for the 5′ end adaptor ligation and 3′ end adaptor ligation reactions, respectively.

FIG. 29 depicts results from a ligation experiment testing the effect of adaptor length and PEG-8000 on ligation efficiency.

FIG. 30 depicts results from a ligation experiment testing the effect of Mn²⁺ vs. incubation temperature.

FIG. 31 depicts an exemplary embodiment of sequencing using an Illumina NGS platform.

FIGS. 32 and 33 depict exemplary embodiments of a target-selective oligonucleotide (TSO) primer. FIGS. 32 and 33 disclose SEQ ID NOS 121-124, respectively, in order of appearance.

FIGS. 34A-34D depict results from an experiment for the assessment of low Tm probe designs. FIGS. 34A-34D disclose SEQ ID NOS 6-8, 10, 12, 9, 11, 13, 15-16, 14, 17-18, 20, 19 and 21, respectively, in order of appearance.

FIGS. 35A-35B, 36A-36B, 37A-37B, and 38A-38B depict results from ddPCR assays testing various primer/probe designs for detection of BRAF alleles.

FIGS. 39-40 demonstrate detection limits of the BRAF low Tm universal probes with barcoded primers.

FIG. 41 depicts results from a numerical analysis to determine exemplary input amounts for a 20,000 partition digital PCR experiment.

FIGS. 42A-42B and 43A-43D depict use of a CNV ddPCR panel for selecting an effective cancer treatment in a patient with colon cancer which has metastasized to the liver.

FIGS. 44A-44B depict results from a single assay which can detect copy number variation and mutation of a gene.

FIGS. 45A-45B illustrate a solution-phase embodiment of a method for library preparation from genomic DNA or RNA for sequencing (e.g., targeted sequencing), including ligation of an adaptor to the 5′-end of gDNA or RNA fragments, extension of TSO(s) hybridized to 5′-adapted fragment(s) containing target DNA or RNA sequence(s), and PCR amplification of the extension product(s).

FIGS. 46A-46B illustrate a solid-phase embodiment of a method for library preparation from genomic DNA or RNA for sequencing (e.g., targeted sequencing), including ligation of a solid phase-bound adaptor to the 5′-end of gDNA or RNA fragments, extension of TSO(s) hybridized to solid phase-bound, 5′-adapted fragment(s) containing target DNA or RNA sequence(s), and PCR amplification of the extension product(s).

FIG. 47A depicts an embodiment of a method for ligating a first adaptor to the 5′-end of DNA or RNA fragments and then ligating a second adaptor to the 3′-end of 5′-adapted DNA or RNA fragments.

FIG. 47B depicts an embodiment of a method for ligating a first adaptor to the 3′-end of DNA or RNA fragments and then ligating a second adaptor to the 5′-end of 3′-adapted DNA or RNA fragments.

FIG. 48 illustrates the dependence of a fluorescence signal (in relative fluorescence units or RFU) on the relative orientation of the fluorophore and quencher upon binding to its complementary sequence as a function of temperature.

FIG. 49 illustrates a method of cancer patient monitoring (longitudinal assay).

FIG. 50 illustrates how probe coverage performance can be analyzed as a linear combination of parameters x_(n), where each parameter can be accorded a different significance or weighting.

FIG. 51 illustrates a DNA preparation and library generation workflow.

FIG. 52 illustrates a profile of T_(0.7) (° C.) of 40-mer probes.

FIG. 53 illustrates a profile of T_(0.7) (° C.) of isoTM probes.

FIG. 54 illustrates a method for determining the ratio of a gene in a target sample to a reference sample based on the total number of base counts as determined through sequencing.

FIG. 55 illustrates a test for Copy Number Alterations (CNAs) based on a Thompson Tau test for outliers within a distribution.

FIG. 56 illustrates correlation of observed copy number alterations with expected copy number alterations from a Cancer Cell Line Encyclopedia (CCLE) dataset (16 cell lines), and of measured allele frequencies with expected allele frequencies from the CCLE dataset (16 cell lines).

FIG. 57 illustrates correlation with ddPCR—quantitative sequencing.

FIG. 58A provides a list of variants of putative significance called by a data analysis pipeline of DNA (30 ng) purified from a fresh frozen core biopsy from lung and sequenced.

FIG. 58B provides a distribution of gene ratios called across the panel of 96 genes. ERBB2 (HER2) was identified as amplified at p<0.005.

FIG. 58C provides a comparison of ratio calls for 12 genes determined with library formation and DNA sequencing provided herein versus a CLIA validated ddPCR test, showing a high correlation (R²=0.999) between the two orthogonal methods.

FIG. 59A shows an analysis of DNA (14 ng) purified from plasma following post-radiative treatment, with the observed distribution of gene ratios across the panel of 96 genes, identifying CCND1 as amplified at p<0.005.

FIG. 59B shows that an interrogation of the TCGA dataset (www.cbioportal.com) revealed the highest incidence of CCND1 amplifications in esophageal cancer.

FIG. 60 illustrates a solution-phase embodiment of a method for library preparation from DNA (e.g., genomic DNA) or RNA for sequencing.

FIG. 61 illustrates a method for library preparation from DNA or RNA for sequencing.

FIG. 62 illustrates a method for library preparation using a primer with a 5′ phosphate.

DETAILED DESCRIPTION

The practice of the present disclosure will employ, unless otherwise indicated, techniques of molecular biology, microbiology and recombinant DNA techniques, which are within the skill of the art. Such techniques are explained fully in the literature. See, e.g., Sambrook, Fritsch & Maniatis, Molecular Cloning: A Laboratory Manual, Fourth Edition (2012); Oligonucleotide Synthesis (M. J. Gait, ed., 1984); Nucleic Acid Hybridization (B. D. Hames & S. J. Higgins, eds., 1984); A Practical Guide to Molecular Cloning (B. Perbal, 1984); and a series, Methods in Enzymology (Academic Press, Inc.). All patents, patent applications, and publications mentioned herein, both supra and infra, are hereby incorporated by reference.

DEFINITIONS

As used in the specification and claims, the singular forms “a”, “an” and “the” can include plural references unless the context clearly dictates otherwise. For example, the term “a cell” can include a plurality of cells, including mixtures thereof.

The term “subject”, as used herein, generally refers to a biological entity containing expressed genetic materials. The biological entity can be a plant, animal, or microorganism, including, e.g., bacteria, viruses, fungi, and protozoa. The subject can be tissues, cells and their progeny of a biological entity obtained in vivo or cultured in vitro. The subject can be a mammal. The mammal can be a human. The human may be diagnosed or suspected of being at high risk for a disease. The disease can be cancer. The human may not be diagnosed or suspected of being at high risk for a disease.

As used herein, a “sample” or “nucleic acid sample” can refer to any substance containing or presumed to contain nucleic acid. The sample can be a biological sample obtained from a subject. The nucleic acids can be RNA or DNA, e.g., genomic DNA, mitochondrial DNA, viral DNA, synthetic DNA, or cDNA reverse transcribed from RNA. The nucleic acids in a nucleic acid sample generally serve as templates for extension of a hybridized primer. In some embodiments, the biological sample is a liquid sample. The liquid sample can be whole blood, plasma, serum, ascites, cerebrospinal fluid, sweat, urine, tears, saliva, buccal sample, cavity rinse, or organ rinse. The liquid sample can be an essentially cell-free liquid sample (e.g., plasma, serum, sweat, cerebrospinal fluid, mucosal secretion, urine, tears, saliva, sputum, or amniotic fluid). In other embodiments, the biological sample is a solid biological sample, e.g., feces or tissue biopsy, e.g., a tumor biopsy. A sample can also comprise in vitro cell culture constituents (including but not limited to conditioned medium resulting from the growth of cells in cell culture medium, recombinant cells and cell components). The sample can comprise a single cell, e.g., a cancer cell, a circulating tumor cell, a cancer stem cell, and the like. In some cases, a sample can be media, e.g., culture media in which cells are cultured, e.g., human cells, e.g., human cell lines, e.g., human cell lines derived from tumor tissue. The media can comprise nucleic acid, e.g., DNA or RNA, e.g., tumor DNA or tumor RNA, e.g., circulating tumor DNA or circulating tumor RNA. The media can comprise circulating nucleic acid, e.g., circulating DNA or RNA.

“Nucleotides” and “nt” are used interchangeably herein to generally refer to biological molecules that can form nucleic acids. Nucleotides can have moieties that contain not only the known purine and pyrimidine bases, but also other heterocyclic bases that have been modified. Such modifications include methylated purines or pyrimidines, acylated purines or pyrimidines, alkylated riboses, or other heterocycles. In addition, the term “nucleotide” includes those moieties that contain hapten, biotin, or fluorescent labels and may contain not only conventional ribose and deoxyribose sugars, but other sugars as well. Modified nucleosides or nucleotides also include modifications on the sugar moiety, e.g., wherein one or more of the hydroxyl groups are replaced with halogen atoms or aliphatic groups, or are functionalized as ethers, amines, or the like. Modified nucleosides or nucleotides can also include peptide nucleic acid (PNA). Peptide nucleic acid generally refers to oligonucleotides in which the deoxyribose backbone has been replaced with a backbone having peptide linkages. Each subunit generally has attached a naturally occurring or non-naturally occurring base. One exemplary PNA backbone is constructed of repeating units of N-(2-aminoethyl) glycine linked through amide bonds. PNA can bind both DNA and RNA to form PNA/DNA or PNA/RNA duplexes. The resulting PNA/DNA or PNA/RNA duplexes can be bound with greater affinity than corresponding DNA/DNA or DNA/RNA duplexes, as evidenced by their higher melting temperatures (Tm). The neutral backbone of the PNA also can render the Tm of PNA/DNA (RNA) duplexes largely independent of salt concentration in a reaction mixture. Thus the PNA/DNA duplex can offer an advantage over DNA/DNA duplex interactions, which are highly dependent on ionic strength. Exemplary embodiments of PNA are described in U.S. Pat. Nos. 7,223,833 and 5,539,083, which are hereby incorporated by reference.

“Nucleotides” can also include nucleotides comprising a Tm-enhancing base (e.g., a Tm-enhancing base nucleotide). Exemplary Tm-enhancing base nucleotides include, but are not limited to, nucleotides with Superbases™, locked nucleic acids (LNA) or bridged nucleic acids (BNA). BNA and LNA generally refer to modified ribonucleotides wherein the ribose moiety is modified with a bridge connecting the 2′ oxygen and 4′ carbon. Generally, the bridge “locks” the ribose in the 3′-endo (North) conformation, which is often found in A-form duplexes. The term “locked nucleic acid” (LNA) generally refers to a class of BNAs in which the ribose ring is “locked” with a methylene bridge connecting the 2′-O atom with the 4′-C atom. LNA nucleosides containing the six common nucleobases (T, C, G, A, U and mC) that appear in DNA and RNA are able to form base-pairs with their complementary nucleosides according to the standard Watson-Crick base pairing rules. Accordingly, Tm-enhancing base nucleotides such as BNA and LNA nucleotides can be mixed with DNA or RNA bases in an oligonucleotide whenever desired. The locked ribose conformation enhances base stacking and backbone pre-organization. Base stacking and backbone pre-organization can give rise to an increased thermal stability (e.g., increased Tm) and discriminative power of duplexes. LNA can discriminate single base mismatches under conditions not possible with other nucleic acids. Locked nucleic acid is disclosed, for example, in WO 99/14226, hereby incorporated by reference. Nucleotides can also include modified nucleotides as described in European Patent Application No. EP1995330, hereby incorporated by reference.

Other modified nucleotides can include 5-Me-dC-CE phosphoramidite, 5-Me-dC-CPG, 2-Amino-dA-CE phosphoramidite, N4-Et-dC-CE Phosphoramidite, N4-Ac-N4-Et-dC-CE Phosphoramidite, N6-Me-dA-CE Phosphoramidite, N6-Ac-N6-Me-dA-CE Phosphoramidite, Zip nucleic acids (ZNA®, described in U.S. patent application Ser. No. 12/086,599, hereby incorporated by reference), 5′-Trimethoxystilbene Cap Phosphoramidite, 5′-Pyrene Cap Phosphoramidite, and 3′-Uaq Cap CPG (Glen Research).

Yet other modified nucleotides can include nucleotides with modified nucleoside bases such as, e.g., 2-Aminopurine, 2,6-Diaminopurine, 5-Bromo-deoxyuridine, deoxyuridine, Inverted dT, inverted ddT, ddC, 5-Methyl deoxycytidine, deoxyInosine, 5-Nitroindole, 2′-O-Methyl RNA bases, Hydroxymethyl dC, Iso-dG and Iso-dC (Eragen Biosciences, Inc), and 2′-Fluoro bases having a fluorine-modified ribose.

The terms “polynucleotides”, “nucleic acid”, “nucleotides” and “oligonucleotides” can be used interchangeably. They can refer to a polymeric form of nucleotides of any length, either deoxyribonucleotides or ribonucleotides, or analogs thereof. Polynucleotides may have any three-dimensional structure, and may perform any function, known or unknown. The following are non-limiting examples of polynucleotides: coding or non-coding regions of a gene or gene fragment, loci (locus) defined from linkage analysis, exons, introns, messenger RNA (mRNA), transfer RNA, ribosomal RNA, ribozymes, cDNA, recombinant polynucleotides, branched polynucleotides, plasmids, vectors, isolated DNA of any sequence, isolated RNA of any sequence, nucleic acid probes, and primers. A polynucleotide may comprise modified nucleotides, such as methylated nucleotides and nucleotide analogs. If present, modifications to the nucleotide structure may be imparted before or after assembly of the polymer. The sequence of nucleotides may be interrupted by non-nucleotide components. A polynucleotide may be further modified after polymerization, such as by conjugation with a labeling component.

The terms “target polynucleotide”, “target region”, and “target”, as used herein, generally refer to a polynucleotide of interest under study. In certain embodiments, a target polynucleotide contains one or more sequences that are of interest and under study. A target polynucleotide can comprise, for example, a genomic sequence. The target polynucleotide can comprise a target sequence whose presence, amount, and/or nucleotide sequence, or changes in these, are desired to be determined.

The target polynucleotide can be a region of a gene associated with a disease. In some embodiments, the region is an exon. In some embodiments, the gene is a druggable target. The term “druggable target”, as used herein, generally refers to a gene or cellular pathway that is modulated by a disease therapy. The disease can be cancer. Accordingly, the gene can be a known cancer-related gene. In some embodiments, the cancer-related gene is selected from the group consisting of ABCA1, BRAF, CHD5, EP300, FLT1, ITPA, MYC, PIK3R1, SKP2, TP53, ABCA7, BRCA1, CHEK1, EPHA3, FLT3, JAK1, MYCL1, PIK3R2, SLC19A1, TP73, ABCB1, BRCA2, CHEK2, EPHA5, FLT4, JAK2, MYCN, PKHD1, SLC1A6, TPM3, ABCC2, BRIP1, CLTC, EPHA6, FN1, JAK3, MYH2, PLCB1, SLC22A2, TPMT, ABCC3, BUB1B, COL1A1, EPHA7, FOS, JUN, MYH9, PLCG1, SLCO1B3, TPO, ABCC4, C1orf144, COPS5, EPHA8, FOXO1, KBTBD11, NAV3, PLCG2, SMAD2, TPR, ABCG2, CABLES1, CREB1, EPHB1, FOXO3, KDM6A, NBN, PML, SMAD3, TRIO, ABL1, CACNA2D1, CREBBP, EPHB4, FOXP4, KDR, NCOA2, PMS2, SMAD4, TRRAP, ABL2, CAMKV, CRKL, EPHB6, GAB1, KIT, NEK11, PPARG, SMARCA4, TSC1, ACVR1B, CARD11, CRLF2, EPO, GATA1, KLF6, NF1, PPARGC1A, SMARCB1, TSC2, ACVR2A, CARM1, CSF1R, ERBB2, GLI1, KLHDC4, NF2, PPP1R3A, SMO, TTK, ADCY9, CAV1, CSMD3, ERBB3, GLI3, KRAS, NKX2-1, PPP2R1A, SOCS1, TYK2, AGAP2, CBFA2T3, CSNK1G2, ERBB4, GNA11, LMO2, NOS2, PPP2R1B, SOD2, TYMS, AKT1, CBL, CTNNA1, ERCC1, GNAQ, LRP1B, NOS3, PRKAA2, SOS1, UGT1A1, AKT2, CCND1, CTNNA2, ERCC2, GNAS, LRP2, NOTCH1, PRKCA, SOX10, UMPS, AKT3, CCND2, CTNNB1, ERCC3, GPR124, LRP6, NOTCH2, PRKCZ, SOX2, USP9X, ALK, CCND3, CYFIP1, ERCC4, GPR133, LTK, NOTCH3, PRKDC, SP1, VEGF, ANAPC5, CCNE1, CYLD, ERCC5, GRB2, MAN1B1, NPM1, PTCH1, SPRY2, VEGFA, APC, CD40LG, CYP19A1, ERCC6, GSK3B, MAP2K1, NQO1, PTCH2, SRC, VHL, APC2, CD44, CYP1B1, ERG, GSTP1, MAP2K2, NR3C1, PTEN, ST6GAL2, WRN, AR, CD79A, CYP2C19, ERN2, GUCY1A2, MAP2K4, NRAS, PTGS2, STAT1, WT1, ARAF, CD79B, CYP2C8, ESR1, HDAC1, MAP2K7, NRP2, PTPN11, STAT3, XPA, ARFRP1, CDC42, CYP2D6, ESR2, HDAC2, MAP3K1, NTRK1, PTPRB, STK11, XPC, ARID1A, CDC42BPB, CYP3A4, ETV4, HGF, MAPK1, NTRK2, PTPRD, SUFU, ZFY, ATM, CDC73, CYP3A5, EWSR1, HIF1A, MAPK3, NTRK3, RAD50, SULT1A1, ZNF521, ATP5A1, CDH1, DACH2, EXT1, HM13, MAPK8, OMA1, RAD51, SUZ12, ATR, CDH10, DCC, EZH2, HMGA1, MARK3, OR10R2, RAF1, TAF1, AURKA, CDH2, DCLK3, FANCA, HNF1A, MCL1, PAK3, RARA, TBX22, AURKB, CDH20, DDB2, FANCD2, HOXA3, MDM2, PARP1, RB1, TCF12, BAI3, CDH5, DDR2, FANCE, HOXA9, MDM4, PAX5, REM1, TCF3, BAP1, CDK2, DGKB, FANCF, HRAS, MECOM, PCDH15, RET, TCF4, BARD1, CDK4, DGKZ, FAS, HSP90AA1, MEN1, PCDH18, RICTOR, TEK, BAX, CDK6, DIRAS3, FBXW7, IDH1, MET, PCNA, RIPK1, TEP1, BCL11A, CDK7, DLG3, FCGR3A, IDH2, MITF, PDGFA, ROR1, TERT, BCL2, CDK8, DLL1, FES, IFNG, MLH1, PDGFB, ROR2, TET2, BCL2A1, CDKN1A, DNMT1, FGFR1, IGF1R, MLL, PDGFRA, ROS1, TGFBR2, BCL2L1, CDKN1B, DNMT3A, FGFR2, IGF2R, MLL3, PDGFRB, RPS6KA2, THBS1, BCL2L2, CDKN2A, DNMT3B, FGFR3, IKBKE, MPL, PDZRN3, RPTOR, TNFAIP3, BCL3, CDKN2B, DOT1L, FGFR4, IKZF1, MRE11A, PHLPP2, RSPO2, TNKS, BCL6, CDKN2C, DPYD, FH, IL2RG, MSH2, PIK3C3, RSPO3, TNKS2, BCR, CDKN2D, E2F1, FHOD3, INHBA, MSH6, PIK3CA, RUNX1, TNNI3K, BIRC5, CDX2, EED, FIGF, INSR, MTHFR, PIK3CB, SDHB, TNR, BIRC6, CEBPA, EGF, FLG2, IRS1, MTOR, PIK3CD, SF3B1, TOP1, BLM, CERK, EGFR, FLNC, IRS2, MUTYH, PIK3CG, SHC1, and TOP2A.

The term “genomic sequence”, as used herein, generally refers to a sequence that occurs in a genome. Because RNAs are transcribed from a genome, this term encompasses sequences that exist in the nuclear genome of an organism, as well as sequences that are present in a cDNA copy of an RNA (e.g., an mRNA) transcribed from such a genome.

The terms “anneal”, “hybridize” and “bind” can be used interchangeably to refer to two polynucleotide sequences, segments or strands, and have the usual meaning in the art. Two complementary sequences (e.g., DNA and/or RNA) can anneal or hybridize by forming hydrogen bonds with complementary bases to produce a double-stranded polynucleotide or a double-stranded region of a polynucleotide.

As used herein, the term “complementary” generally refers to a relationship between two antiparallel nucleic acid sequences in which the sequences are related by the base-pairing rules: A pairs with T or U and C pairs with G. A first sequence or segment that is “perfectly complementary” to a second sequence or segment is complementary across its entire length and has no mismatches. A first sequence or segment is “substantially complementary” to a second sequence or segment when a polynucleotide consisting of the first sequence is sufficiently complementary to specifically hybridize to a polynucleotide consisting of the second sequence.
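By way of non-limiting illustration only, the base-pairing rules stated above can be expressed as a short Python sketch. The function names and the restriction to equal-length sequences are assumptions made for illustration and do not form part of the disclosed methods; the sketch tests only for perfect complementarity and does not attempt to model the hybridization-based notion of substantial complementarity.

```python
# Illustrative sketch only; names and the equal-length restriction are hypothetical.
# Base-pairing rules from the definition above: A pairs with T or U, C pairs with G.
PAIRS = {("A", "T"), ("T", "A"), ("A", "U"), ("U", "A"), ("C", "G"), ("G", "C")}


def mismatch_count(first: str, second: str) -> int:
    """Count non-pairing positions when two equal-length sequences,
    both written 5'->3', are aligned antiparallel."""
    if len(first) != len(second):
        raise ValueError("this sketch assumes sequences of equal length")
    # Reverse the second strand so that each base of `first` faces its partner.
    return sum(
        (a, b) not in PAIRS
        for a, b in zip(first.upper(), reversed(second.upper()))
    )


def is_perfectly_complementary(first: str, second: str) -> bool:
    """True when the two sequences pair across their entire length with no mismatches."""
    return len(first) == len(second) and mismatch_count(first, second) == 0
```

For example, under these assumptions `is_perfectly_complementary("ATGC", "GCAT")` evaluates to true, whereas `mismatch_count("ATGC", "GCAA")` reports a single mismatch.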

The term “duplex,” or “duplexed,” as used herein, can describe two complementary polynucleotides that are base-paired, e.g., hybridized together.

As used herein, the term “Tm” generally refers to the melting temperature of an oligonucleotide duplex, e.g., the temperature at which half of the duplexes remain hybridized and half of the duplexes dissociate into single strands. See Sambrook and Russell (2001; Molecular Cloning: A Laboratory Manual, 3rd ed., Cold Spring Harbor Press, Cold Spring Harbor N.Y., ch. 10).
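The disclosure does not prescribe a particular calculation of Tm. Purely as a hedged illustration, one commonly used rule of thumb for short, unmodified oligonucleotides (often called the Wallace rule) estimates Tm as 2° C. per A/T base plus 4° C. per G/C base. The Python sketch below applies that approximation; the function name is hypothetical, and the estimate ignores salt concentration, strand concentration, and modified bases such as LNA or PNA, for which experimentally determined or nearest-neighbor values would be more appropriate.

```python
def wallace_tm(oligo: str) -> int:
    """Rough Tm estimate in degrees C for a short, unmodified oligonucleotide:
    2 degrees per A/T (or U) base plus 4 degrees per G/C base (rule of thumb only)."""
    seq = oligo.upper()
    at = sum(seq.count(base) for base in "ATU")
    gc = sum(seq.count(base) for base in "GC")
    if at + gc != len(seq):
        # Anything other than A, C, G, T, or U is outside the scope of this sketch.
        raise ValueError("unexpected character in sequence")
    return 2 * at + 4 * gc
```

Such an estimate can serve, at most, as a first approximation when comparing candidate sequences of similar length.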

As used herein, “amplification” of a nucleic acid sequence generally refers to in vitro techniques for enzymatically increasing the number of copies of a target sequence. Amplification methods include both asymmetric methods (in which the predominant product is single-stranded) and other methods (e.g., in which the predominant product is double-stranded). A “round” or “cycle” of amplification can refer to a PCR cycle in which a double stranded template DNA molecule is denatured into single-stranded templates, forward and reverse primers are hybridized to the single stranded templates to form primer/template duplexes, and primers are extended by a DNA polymerase from the primer/template duplexes to form extension products. In subsequent rounds of amplification the extension products are denatured into single stranded templates and the cycle is repeated.
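As a non-limiting illustration of how the number of copies can grow over successive cycles, the following Python sketch uses a commonly used idealized model that is not taken from the disclosure: when both strands are copied each cycle, the copy number is multiplied by (1 + efficiency) per cycle, whereas when only the original templates are copied (as in a single-primer, linear reaction), a fixed number of new strands is added per cycle. The function names and the `efficiency` parameter are assumptions made for illustration.

```python
def copies_exponential(start_copies: float, cycles: int, efficiency: float = 1.0) -> float:
    """Idealized exponential amplification: every template is copied each cycle,
    and the products themselves serve as templates in later cycles."""
    return start_copies * (1.0 + efficiency) ** cycles


def copies_linear(start_copies: float, cycles: int, efficiency: float = 1.0) -> float:
    """Idealized linear amplification: only the original templates are copied,
    so each cycle adds start_copies * efficiency new strands."""
    return start_copies * (1.0 + efficiency * cycles)
```

Under these simplifying assumptions, ten cycles at perfect efficiency would yield 1,024 copies per starting molecule in the exponential case and 11 in the linear case; real reactions generally fall short of perfect efficiency and plateau at later cycles.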

The terms “template”, “template strand”, “template DNA” and “template nucleic acid” can be used interchangeably herein to refer to a strand of DNA that is copied by an amplification cycle.

The term “denaturing,” as used herein, generally refers to the separation of a nucleic acid duplex into two single strands.

The term “extending”, as used herein, generally refers to the extension of a primer hybridized to a template nucleic acid by the addition of nucleotides using an enzyme, e.g., a polymerase.

A “primer” is generally a nucleotide sequence (e.g., an oligonucleotide), generally with a free 3′-OH group, that hybridizes with a template sequence (such as a target polynucleotide, or a primer extension product) and is capable of promoting polymerization of a polynucleotide complementary to the template. A primer can be, for example, a sequence of the template (such as a primer extension product or a fragment of the template created following RNase cleavage of a template-DNA complex) that is hybridized to a sequence in the template itself (for example, as a hairpin loop), and that is capable of promoting nucleotide polymerization. Thus, a primer can be an exogenous (e.g., added) primer or an endogenous (e.g., template fragment) primer.

The terms “determining”, “measuring”, “evaluating”, “assessing,” “assaying,” and “analyzing” can be used interchangeably herein to refer to any form of measurement, and include determining if an element is present or not. These terms can include quantitative and/or qualitative determinations. Assessing may be relative or absolute. “Assessing the presence of” can include determining the amount of something present, as well as determining whether it is present or absent.

The term “free in solution,” as used herein, can describe a molecule, such as a polynucleotide, that is not bound or tethered to a solid support.

The term “genomic fragment”, as used herein, can refer to a region of a genome, e.g., an animal or plant genome such as the genome of a human, monkey, rat, fish or insect or plant. A genomic fragment may be adaptor ligated (in which case it has an adaptor ligated to one or both ends of the fragment, or to at least the 5′ end of a molecule), or non-adaptor ligated.

“Pre-amplification”, as used herein, generally refers to non-clonal amplification of nucleic acids. For example, pre-amplification of a nucleic acid library is generally performed prior to clonal amplification of the library and/or loading onto a sequencer.

The term “ligase”, as used herein, generally refers to an enzyme that is commonly used to join polynucleotides together or to join the ends of a single polynucleotide.

The term “ligation”, as used herein, generally refers to the joining of two ends of polynucleotides or the joining of ends of a single polynucleotide by the formation of a covalent bond between the ends to be joined. The covalent bond can be a phosphodiester bond.

The term “ATP-dependent ligation”, as used herein, generally refers to ligation by an ATP-dependent ligase. An exemplary mechanism of ATP-dependent ligation is described herein.

“Donor” and “acceptor” nucleic acid species generally refer to two distinct populations of nucleic acid molecules to be joined in a ligation reaction. The “donor” species generally refers to a population of nucleic acid molecules which may accept a nucleoside monophosphate (NMP) at either a 5′ or 3′ end. The “acceptor” species generally refers to a second population of nucleic acid molecules containing a 3′ or 5′ OH group which may be ligated to the “donor” species via the NMP at either the 5′ or 3′ end of the donor species.

The donor and acceptor species can be any nucleic acid species. They can be, for example, polynucleotides isolated from a biological source. The biological source can be a subject. Exemplary biological sources and subjects are described herein. They can be oligonucleotides. Methods for preparing oligonucleotides of specific sequence are known in the art, and include, for example, cloning and restriction of appropriate sequences and direct chemical synthesis. Chemical synthesis methods may include, for example, the phosphotriester method described by Narang et al., 1979, Methods in Enzymology 68:90, the phosphodiester method disclosed by Brown et al., 1979, Methods in Enzymology 68:109, the diethylphosphoramidate method disclosed in Beaucage et al., 1981, Tetrahedron Letters 22:1859, and the solid support method disclosed in U.S. Pat. No. 4,458,066. They can be RNA or DNA. The DNA can be partially or fully denatured DNA. The DNA can be single stranded (ss) DNA. Partially denatured DNA can be “frayed” at its ends such that a “frayed” end can comprise 1, 2, 3, 4, 5, or more than 5 non-annealed nucleotides.

The donor and/or acceptor nucleic acid species can be of any size, ranging from, e.g., 1-50 nt, 10-100 nt, 50-200 nt, 50-2000 nt, 100-400 nt, 200-600 nt, 500-1000 nt, 800-2000 nt, or greater than 2000 nt. In some embodiments, the donor and/or acceptor nucleic acid species is over 120 nt long.

The donor or acceptor nucleic acid species can include, e.g., genomic nucleic acids, adaptor sequences, and/or barcode sequences. The donor or acceptor nucleic acid species can include oligonucleotides. The donor or acceptor nucleic acid species can comprise a detectable label or affinity tag.

The detectable label can be any molecule that enables detection of a molecule to be detected. Non-limiting examples of detectable labels include, e.g., chelators, photoactive agents, radioactive moieties (e.g., alpha, beta and gamma emitters), fluorescent agents, luminescent agents, paramagnetic ions, or enzymes that produce a detectable signal in the presence of certain reagents (e.g., horseradish peroxidase, alkaline phosphatase, glucose oxidase).

Exemplary fluorescent compounds include, e.g., fluorescein isothiocyanate, rhodamine, phycoerythrin, phycocyanin, allophycocyanin, o-phthaldehyde, fluorescamine, and commercially available fluorophores such as Alexa Fluor 350, Alexa Fluor 488, Alexa Fluor 532, Alexa Fluor 546, Alexa Fluor 568, Alexa Fluor 594, Alexa Fluor 647, DyLight dyes such as DyLight 488, DyLight 594, DyLight 647, and BODIPY dyes such as BODIPY 493/503, BODIPY FL, BODIPY R6G, BODIPY 530/550, BODIPY TMR, BODIPY 558/568, BODIPY 564/570, BODIPY 576/589, BODIPY 581/591, BODIPY TR, BODIPY 630/650, BODIPY 650/665, Cascade Blue, Cascade Yellow, Dansyl, lissamine rhodamine B, Marina Blue, Oregon Green 488, Oregon Green 514, Pacific Blue, rhodamine 6G, rhodamine green, rhodamine red, tetramethylrhodamine and Texas Red. Such compounds are commercially available (see, e.g., Molecular Probes, Inc.).

The affinity tag can be selected to have affinity to a capture moiety. The affinity tag can comprise, by way of non-limiting example only, biotin, desthiobiotin, histidine, polyhistidine, myc, hemagglutinin (HA), FLAG, a fluorescence tag, a tandem affinity purification (TAP) tag, a glutathione S-transferase (GST) tag, or derivatives thereof. The capture moiety can comprise, e.g., avidin, streptavidin, Neutravidin™, nickel, or glutathione or other molecule capable of binding the affinity tag.

In some embodiments, the acceptor species and the donor species can be the same species. For example, in some embodiments a user may desire to circularize a linear nucleic acid or to form concatemers of a single nucleic acid species.

The term “reaction mixture” as used herein generally refers to a mixture of components necessary to effect a desired reaction. The mixture may further comprise a buffer (e.g., a Tris buffer). The reaction mixture may further comprise a monovalent salt. The reaction mixture may further comprise a cation, e.g., Mg²⁺ and/or Mn²⁺. The concentration of each component is well known in the art and can be further optimized by an ordinary skilled artisan. In some embodiments, the reaction mixture also comprises additives including, but not limited to, non-specific background/blocking nucleic acids (e.g., salmon sperm DNA), non-specific background/blocking proteins (e.g., bovine serum albumin, non-fat dry milk), biopreservatives (e.g., sodium azide), PCR enhancers (e.g., Betaine, Trehalose, etc.), and inhibitors (e.g., RNase inhibitors). In some embodiments, a nucleic acid sample is admixed with the reaction mixture.

A “primer binding site” can refer to a site to which a primer hybridizes in an oligonucleotide or a complementary strand thereof.

The term “separating”, as used herein, can refer to physical separation of two elements (e.g., by size, affinity, degradation of one element, etc.).

The term “sequencing”, as used herein, can refer to a method by which the identity of at least 10 consecutive nucleotides (e.g., the identity of at least 20, at least 50, at least 100, at least 200, or at least 500 or more consecutive nucleotides) of a polynucleotide is obtained.

The term “adaptor-ligated”, as used herein, can refer to a nucleic acid that has been ligated to an adaptor. The adaptor can be ligated to a 5′ end or a 3′ end of a nucleic acid molecule, or can be added to an internal region of a nucleic acid molecule.

The term “bridge PCR” can refer to a solid-phase polymerase chain reaction in which the primers that are extended in the reaction are tethered to a substrate by their 5′ ends. During amplification, the amplicons form a bridge between the tethered primers. Bridge PCR (which may also be referred to as “cluster PCR”) is used in Illumina's Solexa platform. Bridge PCR and Illumina's Solexa platform are generally described in a variety of publications, e.g., Gudmundsson et al (Nat. Genet. 2009 41:1122-6), Out et al (Hum. Mutat. 2009 30:1703-12) and Turner (Nat. Methods 2009 6:315-6), U.S. Pat. No. 7,115,400, and U.S. patent application publication nos. US20080160580 and US20080286795.

The term “barcode sequence” as used herein, generally refers to a unique sequence of nucleotides that can encode information about an assay. A barcode sequence can encode information relating to the identity of an interrogated allele, identity of a target polynucleotide or genomic locus, identity of a sample, a subject, a molecule, or any combination thereof. A barcode sequence can be a portion of a primer, a reporter probe, or both. A barcode sequence may be at the 5′-end or 3′-end of an oligonucleotide, or may be located in any region of the oligonucleotide. A barcode sequence may or may not be part of a template sequence. Barcode sequences may vary widely in size and composition; the following references provide guidance for selecting sets of barcode sequences appropriate for particular embodiments: Brenner, U.S. Pat. No. 5,635,400; Brenner et al, Proc. Natl. Acad. Sci., 97: 1665-1670 (2000); Shoemaker et al, Nature Genetics, 14: 450-456 (1996); Morris et al, European patent publication 0799897A1; Wallace, U.S. Pat. No. 5,981,179. A barcode sequence may have a length of about 4 to 36 nucleotides, about 6 to 30 nucleotides, or about 8 to 20 nucleotides.

The term “mutation”, as used herein, generally refers to a change of the nucleotide sequence of a genome as compared to a reference. Mutations can involve large sections of DNA (e.g., copy number variation). Mutations can involve whole chromosomes (e.g., aneuploidy). Mutations can involve small sections of DNA. Examples of mutations involving small sections of DNA include, e.g., point mutations or single nucleotide polymorphisms, multiple nucleotide polymorphisms, insertions (e.g., insertion of one or more nucleotides at a locus), multiple nucleotide changes, deletions (e.g., deletion of one or more nucleotides at a locus), and inversions (e.g., reversal of a sequence of one or more nucleotides).

The term “locus”, as used herein, can refer to a location of a gene, nucleotide, or sequence on a chromosome. An “allele” of a locus, as used herein, can refer to an alternative form of a nucleotide or sequence at the locus. A “wild-type allele” generally refers to an allele that has the highest frequency in a population of subjects. A “wild-type” allele generally is not associated with a disease. A “mutant allele” generally refers to an allele that has a lower frequency than a “wild-type allele” and may be associated with a disease. A “mutant allele” does not have to be associated with a disease. The term “interrogated allele” generally refers to the allele that an assay is designed to detect.

The term “single nucleotide polymorphism”, or “SNP”, as used herein, generally refers to a type of genomic sequence variation resulting from a single nucleotide substitution within a sequence. “SNP alleles” or “alleles of a SNP” generally refer to alternative forms of the SNP at a particular locus. The term “interrogated SNP allele” generally refers to the SNP allele that an assay is designed to detect.

The term “copy number variation” or “CNV” refers to differences in the copy number of genetic information. In many aspects it refers to differences in the per genome copy number of a genomic region. For example, in a diploid organism the expected copy number for autosomal genomic regions is 2 copies per genome. Such genomic regions should be present at 2 copies per cell. For a recent review see Zhang et al. Annu. Rev. Genomics Hum. Genet. 2009. 10:451-81. CNV is a source of genetic diversity in humans and can be associated with complex disorders and disease, for example, by altering gene dosage, gene disruption, or gene fusion. CNVs can also represent benign polymorphic variants. CNVs can be large, for example, larger than 1 Mb, but many are smaller, for example between 100 bases and 1 Mb. More than 38,000 CNVs greater than 100 bases (and less than 3 Mb) have been reported in humans. Along with SNPs these CNVs account for a significant amount of phenotypic variation between individuals. In addition to having deleterious impacts, e.g., causing disease, they may also result in advantageous variation.

The term “structural variation” refers to variation in the structure of a chromosome. Structural variations can be deletions, duplications, copy-number variants, insertions, inversions, and translocations. In some cases, two regions that are far apart are brought into proximity. A hybrid gene formed from two previously separate genes, which can be joined by, for example, translocation, deletion, or inversion events, can be referred to as a “gene fusion” or “fusion gene.”

In certain cases, an oligonucleotide used in the method described herein may be designed using a reference genomic region, i.e., a genomic region of known nucleotide sequence, e.g., a chromosomal region whose sequence is deposited at NCBI's Genbank database or other database, for example.

The term “genotyping”, as used herein, generally refers to a process of determining differences in the genetic make-up (genotype) of an individual by examining the individual's DNA sequence using biological assays and comparing it to another individual's sequence or a reference sequence.

A “plurality” generally contains at least 2 members. In certain cases, a plurality may have at least 10, at least 100, at least 1,000, at least 10,000, at least 100,000, at least 1,000,000, at least 10,000,000, at least 100,000,000, or at least 1,000,000,000 or more members.

The term “separating”, as used herein, generally refers to physical separation of two elements (e.g., by cleavage, hydrolysis, or degradation of one of the two elements).

The terms “label” and “detectable moiety” can be used interchangeably herein to refer to any atom or molecule which can be used to provide a detectable signal, and which can be attached to a nucleic acid or protein. Labels may provide signals detectable by fluorescence, radioactivity, colorimetry, gravimetry, X-ray diffraction or absorption, magnetism, enzymatic activity, and the like.

Overview

Aspects of the disclosure relate to methods and kits that improve the monitoring and treatment of a subject suffering from a disease. The disease can be a cancer, e.g., a tumor, a leukemia such as acute leukemia, acute T-cell leukemia, acute lymphocytic leukemia, acute myelocytic leukemia, myeloblastic leukemia, promyelocytic leukemia, myelomonocytic leukemia, monocytic leukemia, erythroleukemia, chronic leukemia, chronic myelocytic (granulocytic) leukemia, or chronic lymphocytic leukemia, polycythemia vera, lymphomas such as Hodgkin's lymphoma, follicular lymphoma or non-Hodgkin's lymphoma, multiple myeloma, Waldenström's macroglobulinemia, heavy chain disease, solid tumors, sarcomas and carcinomas such as, e.g., fibrosarcoma, myxosarcoma, liposarcoma, chondrosarcoma, osteogenic sarcoma, lymphangiosarcoma, mesothelioma, Ewing's tumor, leiomyosarcoma, rhabdomyosarcoma, colon carcinoma, colorectal cancer, pancreatic cancer, breast cancer, ovarian cancer, prostate cancer, squamous cell carcinoma, basal cell carcinoma, adenocarcinoma, sweat gland carcinoma, sebaceous gland carcinoma, papillary carcinoma, papillary adenocarcinomas, cystadenocarcinoma, medullary carcinoma, bronchogenic carcinoma, renal cell carcinoma, hepatoma, bile duct carcinoma, choriocarcinoma, seminoma, embryonal carcinoma, Wilms' tumor, cervical cancer, uterine cancer, testicular tumor, lung carcinoma, small cell lung carcinoma, bladder carcinoma, epithelial carcinoma, glioma, craniopharyngioma, ependymoma, pinealoma, hemangioblastoma, acoustic neuroma, oligodendroglioma, meningioma, melanoma, neuroblastoma, retinoblastoma, endometrial cancer, non-small cell lung cancer.

The subject can be suspected or known to harbor a solid tumor, or can be a subject who previously harbored a solid tumor.

FIG. 49 illustrates a method of monitoring a patient's cancer (longitudinal assay). The method can comprise sequencing, e.g., by massively parallel sequencing (next generation sequencing), one or more genes from an initial tumor sample, e.g., a formalin-fixed paraffin embedded (FFPE) sample, a fine needle aspirate (FNA) biopsy, a core needle biopsy (CNB), and/or a cell-free sample (e.g., cell-free plasma sample). An initial sample can be a sample taken from a subject before the subject receives a cancer treatment. When plasma is used as an initial sample, the amount of DNA used from the sample can be about 1 ng of DNA. When plasma is used as an initial sample, the volume of plasma can be about 3 mL. In some cases, only a solid tumor sample (e.g., FFPE sample, FNA sample, or CNB sample) for sequencing is obtained from a subject before the subject receives a cancer treatment, and nucleic acid from the sample is sequenced. In some cases, only a fluid sample (e.g., plasma) for sequencing is taken from a subject before the subject receives a cancer treatment, and nucleic acid is sequenced from the fluid (e.g., plasma) sample. In some cases, both a solid tumor sample and a fluid sample (e.g., plasma) for sequencing are taken from a subject before the subject receives a cancer treatment, and nucleic acid is sequenced from the solid tumor sample and the fluid (e.g., plasma) sample. Sequencing data from the solid tumor sample and fluid sample taken before the subject receives a cancer treatment can be compared. In some cases, sequencing data from a solid tumor sample and fluid sample taken before the subject receives a cancer treatment are not compared.

The number of genes sequenced in a sample (e.g., initial sample) can be about, or at least 1, 5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 96, 100, 110, 120, 129, 130, 140, 150, 160, 170, 180, 190, 200, 300, 400, 500, 600, 700, 800, 900 or more genes. The sequencing can occur in a Clinical Laboratory Improvement Amendments (CLIA) certified laboratory and/or College of American Pathologists (CAP) certified laboratory. Analysis of the sequencing data (e.g., bioinformatics) can occur in a CLIA and/or CAP certified laboratory.

The sequence data can be used to determine a profile of mutations in the genes. The profile of mutations can be listed in a report. The report can be provided to a caregiver or to the subject from whom one or more samples were taken. The report can indicate potential therapeutic options based on the profile of mutations.

A subsequent sample can be taken from a subject after the initial sample is taken, e.g., to monitor one or more genes sequenced in an initial sample. A plurality of subsequent samples can be taken from the subject (e.g., about, or at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100 samples). The subsequent sample from the subject can be a fluid sample, e.g., a plasma sample. Nucleic acid, e.g., cell-free nucleic acid, e.g., cell-free DNA from the subsequent sample can be analyzed. The nucleic acid from the subsequent sample can be analyzed by sequencing, e.g., massively parallel sequencing (next generation sequencing). The nucleic acid in the subsequent sample can be analyzed by amplification, e.g., PCR, e.g., digital PCR (dPCR), e.g., droplet digital PCR (e.g., ddPCR). Nucleic acid in the subsequent sample can be analyzed by both amplification (e.g., dPCR, e.g., ddPCR) and sequencing, e.g., massively parallel sequencing (next generation sequencing).

A subsequent sample can be taken from a subject at a regular interval or an irregular interval. A subsequent sample can be taken from a subject daily, weekly, twice a month, monthly, quarterly, semi-annually, or annually.

In some cases, subsequent samples can be analyzed by sequencing until sequencing no longer provides sufficient sensitivity to detect a mutation or alteration in a gene identified in an initial sample. For example, a mutation can be identified in a gene by sequencing (e.g., using Illumina® MiSeq) of nucleic acid from an initial solid tumor sample or an initial cell-free sample (e.g., plasma), and sequencing can be used to detect a presence or absence of the mutation in the gene in a subsequent sample (e.g., fluid sample, e.g., plasma). When sequencing is no longer able to detect the mutation in the gene in a subsequent sample, an amplification based assay (e.g., dPCR, e.g., ddPCR using, e.g., a Bio-Rad QX200™ Droplet Digital™ PCR System instrument) can be used to detect a presence or absence of the mutation in the gene in subsequent samples. In some cases, an amplification based method, e.g., dPCR, e.g., ddPCR, can have higher sensitivity than a sequencing based method. In some cases, a mutation detected in an initial sample will not be detected in a subsequent sample that is analyzed by sequencing, but will be detected in a subsequent sample that is analyzed by amplification, e.g., ddPCR. In some cases, a mutation present in an initial sample will not be detected in a subsequent sample analyzed by sequencing and also not detected in a subsequent sample analyzed by amplification (e.g., ddPCR).

The number of genes analyzed in a subsequent sample can be less than the number of genes analyzed in an initial sample. The genes analyzed in the subsequent sample can be a subset of the genes analyzed in an initial sample. The genes analyzed in the subsequent sample can be based on a profile of mutations identified in the initial sample (a profile of personalized variants). A number of genes analyzed in a subsequent sample can be about, or at least 1, 5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 96, 100, 110, 120, 129, 130, 140, 150, 160, 170, 180, 190, 200, 300, 400, 500, 600, 700, 800, 900 or more genes. In some cases, a number of genes analyzed in a subsequent sample can be more than a number of genes analyzed in an initial sample. Genes monitored in subsequent samples can be analyzed to monitor the cancer, monitor effectiveness of a treatment, detect evolution of the cancer, detect cancer recurrence, detect cancer relapse, or detect cancer progression.

Subsequent samples can be analyzed for a duration of a cancer in a subject. If a recurrence of cancer is identified in a subsequent sample, a second sample can be taken from the subject and sequenced. The second sample can be a solid sample or fluid sample (e.g., cell-free sample) and can be subjected to sequencing, e.g., massively parallel sequencing (next generation sequencing), to determine a profile of mutations. In some cases, a second sample is a solid tumor sample, and nucleic acid from the solid tumor sample is sequenced.

Sequencing can detect gene amplification, e.g., at least 50%, 60%, 70%, 80%, 90%, 95%, 96%, 97%, 98%, 98.5%, 99%, 99.5%, or 100% of gene amplifications tested. Gene amplifications in a sample can be detected by digital PCR, e.g., ddPCR. Use of ddPCR can detect at least 50%, 60%, 70%, 80%, 90%, 95%, 96%, 97%, 98%, 98.5%, 99%, 99.5%, or 100% of gene amplifications tested. Gene amplifications can be detected using, e.g., fluorescent in-situ hybridization (FISH).

Provided herein are compositions and kits for nucleic acid library formation. The library formation can comprise target capture via probe hybridization and extension prior to sequencing. Paired-end reads can be used to align reads from a given probe. FIG. 51 illustrates a workflow for DNA preparation and library generation; total preparation time can be about 8 hr. Preparation can include enzymatic manipulations interspersed with incubations with Solid Phase Reversible Immobilization (SPRI) beads to purify the nucleic acid intermediate.

DNA from an FFPE sample can be used for library preparation. DNA from an FFPE sample can comprise mutations, e.g., oxoguanine, dUTP, cross-linked moieties, and/or abasic sites. Damaged bases can be excised. In some cases, no “corrective” processing steps are involved (base errors are not corrected). Fragments of DNA can be phosphorylated and capped with ddNTPs. Single stranded adaptors can be ligated to single stranded DNA fragments from a sample. A double-digit yield of adapted DNA fragments can be achieved to allow for an improved recovery of sequence information from a sample. In some cases, no whole genome PCR is performed, which can minimize bias in representation. A process of library preparation can include generation of fragmented DNA, generation of adapted DNA, target capture, surface loading, and sequencing, with no enrichment of DNA between generation of adapted DNA and target capture by amplification with primers that amplify fragments having adaptors on each end of the fragment.

In one example, 3646 capture oligos are used to target 96 genes. The probes (capture oligos) can “tile” across each strand of each exon of each gene. In some cases, probes (capture oligos) of fixed length are used. In some cases, use of probes of the same length (e.g., 40-mers) can result in difficulties in defining an appropriate hybridization temperature (e.g., FIG. 52). FIG. 53 illustrates isoTM probes generated based on the equations below. The total enthalpy (ΔH_(tot)) and entropy (ΔS_(tot)) for a given sequence of n nucleotides can be determined based on the nearest neighbor parameters of SantaLucia and Hicks (2004):

${\Delta \; H_{tot}} = {\sum\limits_{1}^{n - 1}\; {\Delta \; H_{N_{i}N_{i + 1}}}}$${\Delta \; S_{tot}} = {\sum\limits_{1}^{n - 1}\; {\Delta \; S_{N_{i}N_{i + 1}}}}$

The K_(eq) at which the fractional binding (f_(AB)) is 0.7 is determined for A_(tot)=B_(tot)=0.2 μM:

$AB = f_{AB} \cdot \min\left(A_{tot}, B_{tot}\right)$

$K_{eq} = \frac{AB}{\left(A_{tot} - AB\right)\left(B_{tot} - AB\right)}$

These values are used to determine the temperature at which a fractional binding of 0.7 is expected for the sequence, using the relationship between K_(eq) and free energy (ΔG) and incorporating a salt correction parameter based on the buffer salt concentrations utilized and their dependence on sequence characteristics:

$T_{corr,f_{AB}} = \frac{\Delta H_{tot}}{\Delta S_{tot} - R \ln K_{eq}} + g_{corr}\left(Na^{+}, Mg^{2+}, \%GC, n\right)$
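The calculation above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used to generate FIG. 53: the nearest-neighbor table is assumed to reflect commonly cited unified values and should be verified against SantaLucia and Hicks (2004) before use; initiation terms are omitted because the summations above run only over nearest-neighbor pairs; the salt correction g_corr is a placeholder argument defaulting to 0 because its functional form is not given here; and all function names, including the hypothetical helper iso_tm_probe, are illustrative.

```python
import math

R_CAL = 1.987  # gas constant in cal/(mol*K)

# ASSUMED nearest-neighbor parameters: (dH in kcal/mol, dS in cal/(mol*K)).
# Illustrative values only; verify against the cited reference before use.
_NN_BASE = {
    "AA": (-7.6, -21.3), "AT": (-7.2, -20.4), "TA": (-7.2, -21.3),
    "CA": (-8.5, -22.7), "GT": (-8.4, -22.4), "CT": (-7.8, -21.0),
    "GA": (-8.2, -22.2), "CG": (-10.6, -27.2), "GC": (-9.8, -24.4),
    "GG": (-8.0, -19.9),
}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Fill in the reverse-complement dinucleotides (e.g., TT from AA).
NN = dict(_NN_BASE)
for _pair, _params in _NN_BASE.items():
    NN.setdefault(_pair.translate(_COMPLEMENT)[::-1], _params)


def total_dh_ds(seq):
    """Sum dH (cal/mol) and dS (cal/(mol*K)) over the n-1 neighbor pairs."""
    seq = seq.upper()
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    dh = 1000.0 * sum(NN[p][0] for p in pairs)
    ds = sum(NN[p][1] for p in pairs)
    return dh, ds


def keq_at_fraction(f_ab=0.7, a_tot=0.2e-6, b_tot=0.2e-6):
    """Equilibrium constant (1/M) at which the bound fraction equals f_ab."""
    ab = f_ab * min(a_tot, b_tot)
    return ab / ((a_tot - ab) * (b_tot - ab))


def temp_at_fraction(seq, f_ab=0.7, a_tot=0.2e-6, b_tot=0.2e-6, g_corr=0.0):
    """Temperature (deg C) at which fraction f_ab of the probe is bound.

    g_corr is a placeholder for the salt correction term (assumption).
    """
    dh, ds = total_dh_ds(seq)
    k_eq = keq_at_fraction(f_ab, a_tot, b_tot)
    return dh / (ds - R_CAL * math.log(k_eq)) - 273.15 + g_corr


def iso_tm_probe(target, t_goal, min_len=15, max_len=35):
    """Hypothetical helper: choose the prefix of `target` whose predicted
    temperature is closest to t_goal, within the given length bounds."""
    upper = min(max_len, len(target))
    if upper < min_len:
        raise ValueError("target is shorter than min_len")
    best = min(range(min_len, upper + 1),
               key=lambda n: abs(temp_at_fraction(target[:n]) - t_goal))
    return target[:best]
```

Varying the probe length until the predicted temperature approaches a common goal, as iso_tm_probe does, is one simple way to obtain probes of differing length but similar hybridization behavior; a production design would also apply the salt correction.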

Probes can be designed to tile across exons of an entire gene locus (e.g., APC gene) and/or across large genomic distances (e.g., 1.5 Mb encompassing SMAD4 at about 400×).

Hybridization of capture probes to target sequences can be achieved through initial heat denaturation of the DNA sample in the presence of the capture probes at 95-98° C. for 1 min, followed by slow annealing through a decrease in temperature by 1° C./min for 35 min, and incubation at 60-65° C. for 30 min, 1 hr or up to 16 hrs. Following hybridization, the probe can be extended with Phusion DNA polymerase, and resulting molecules can be expanded with Phusion DNA polymerase. In some cases, capture does not comprise binding to a solid support (e.g., streptavidin solid support). Capture probes can comprise about 15 to about 35 bases that anneal to a target. Each hybridization event can lead directly to library formation, and extension can complete a library member. Both strands of the sample DNA can be captured and independently pooled, and the total incubation time can be about 1 hour. In some cases, PCR is used. In some cases, PCR is minimal or is not used. In some cases, 80-120 base “bait” probes are not used so that non-specific binding and/or inter-strand hybridization is minimized.

Libraries can be designed for a MiSeq 600V3 sequencing cartridge (2×250 paired-end run in about 2.5 days). Input DNA can be from FFPE, plasma, or fresh-frozen tissue, in some cases up to 300 ng purified DNA. In some cases, 1 ng of input DNA is used. In some cases, 6 samples with unique barcodes are used per MiSeq run, with 2 samples allocated for positive and negative controls.

In some cases, every gene in a panel used is actionable, e.g., druggable or prognostic. In some cases, probes are stored in a flexible format, e.g., probes can be expanded or down-selected as new drugs/targets are identified. Tiling can be adjusted based on sequencing chemistries.

In some cases, a copy number alteration (CNA) is detected. In cancer, actionable mutations can have the following distribution: rearrangement: 3%; truncation: 17%; gene deletion: 8%; gene amplification: 33%; substitution/indel: 8%; mutation hotspots: 31%.

Tests for CNAs can be as described in FIGS. 54 and 55. The total basecount per gene ⟨C_(i)⟩ can be determined by summing the individual basecounts C_(i,j) in each segment j (S_(j)) that comprises the gene of interest (i) and normalizing by the median basecount across all genes measured in the target sample.

$\langle C_{i}\rangle = \frac{\sum_{j=1}^{n} C_{i,j}}{\operatorname{median}\left\{\sum_{j=1}^{n} C_{i,j}\right\}_{i = 1 \ldots n}}$

This value can be divided by the corresponding total basecount per gene for a calibrant sample, and the logarithm of the quotient taken, to derive a log ratio r_(i).

$r_{i} = \log\frac{\langle C_{i,tar}\rangle}{\langle C_{i,ref}\rangle}$

The variance in the log ratios σ_(r)² can then be approximated by assuming a normal distribution of log ratios centered at 0.

$\sigma_{r}^{2} \sim \frac{1}{n-1}\sum_{i=1}^{n}\left(r_{i} - \frac{1}{n}\sum_{i=1}^{n} r_{i}\right)^{2}$

With this error model an outlier statistic can be derived based on a Thompson-Tau outlier test to determine if an observed log ratio for a given gene falls outside the distribution of log ratios observed for the rest of the population at a desired level of significance described by the z-score when n is sufficiently large.

$\tau = \frac{t_{\alpha/2} \cdot (n-1)}{\sqrt{n \cdot \left(n - 2 + t_{\alpha/2}^{2}\right)}} \approx \frac{z \cdot (n-1)}{\sqrt{n \cdot \left(n - 2 + z^{2}\right)}} \text{ for } n \rightarrow \infty$

If the distance of a given gene ratio r_(i) from the mean is greater than 0 and its magnitude is larger than the tau statistic multiple of the standard deviation of the distribution, then the gene can be defined as “AMPLIFIED”:

$r_{i} - \frac{1}{n}\sum_{i=1}^{n} r_{i} > \tau\sigma_{r}$

If the distance of a given gene ratio r_(i) from the mean is less than 0 and its magnitude is larger than the tau statistic multiple of the standard deviation of the distribution, then the gene can be defined as “DELETED”:

$r_{i} - \frac{1}{n}\sum_{i=1}^{n} r_{i} < -\tau\sigma_{r}$
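The copy number test above can be sketched in Python as follows. This is an illustrative sketch only: the input layout (a mapping from gene name to a list of per-segment basecounts), the base-2 logarithm, the default significance level of 0.05, the use of the large-n z approximation for the tau statistic, the “NEUTRAL” label for genes that are not outliers, and all function names are assumptions that are not specified in the text.

```python
import math
from statistics import NormalDist, mean, median, stdev


def normalized_counts(segment_counts):
    """Total basecount per gene divided by the median total across genes."""
    totals = {gene: sum(counts) for gene, counts in segment_counts.items()}
    med = median(list(totals.values()))
    return {gene: total / med for gene, total in totals.items()}


def call_cna(target, reference, alpha=0.05):
    """Label each gene AMPLIFIED, DELETED, or NEUTRAL (assumed label).

    target, reference: dict mapping gene -> list of per-segment basecounts.
    """
    c_tar = normalized_counts(target)
    c_ref = normalized_counts(reference)
    genes = sorted(c_tar)

    # Log ratio of target to calibrant (base 2 is an assumption).
    r = {g: math.log2(c_tar[g] / c_ref[g]) for g in genes}
    values = [r[g] for g in genes]

    n = len(values)
    r_mean = mean(values)
    sigma = stdev(values)  # sample standard deviation, 1/(n-1) normalization

    # Tau statistic using the z approximation for large n.
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    tau = z * (n - 1) / math.sqrt(n * (n - 2 + z * z))

    calls = {}
    for g in genes:
        distance = r[g] - r_mean
        if distance > tau * sigma:
            calls[g] = "AMPLIFIED"
        elif distance < -tau * sigma:
            calls[g] = "DELETED"
        else:
            calls[g] = "NEUTRAL"
    return calls
```

For small gene panels, the Student t quantile in the exact form of the tau statistic would be the more appropriate choice; the z approximation is used here only to keep the sketch within the standard library.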

FIG. 56 illustrates correlation of measured copy number alterations with expected copy number alterations (left panel) and measured allele frequencies with expected allele frequencies (right panel) as recorded in the Broad-Novartis Cancer Cell Line Encyclopedia (CCLE) dataset (16 cell lines). FIG. 57 illustrates correlation of quantitative sequencing with ddPCR. The left panel shows a comparison of ratio calls for 12 genes determined with a library formation and DNA sequencing method provided herein versus a CLIA validated ddPCR test, which shows a high correlation (R²=0.999) between the two orthogonal methods. The blue comparison includes PCR duplicates, whereas the red excludes PCR duplicates. The right panel shows a comparison of single nucleotide variants for 96 genes with a library formation and DNA sequencing technique versus an array test, which shows a high correlation (R²=0.94329) between the two orthogonal methods. The blue comparison includes PCR duplicates, whereas the red excludes PCR duplicates. No evidence of PCR bias is observed. A 2% limit of detection (LOD) is found.

Primer/Probe Design

Methods described herein can involve designing primers/probes for use in library formation and/or amplification. For example: 1) A set of primers/probes can be designed that fulfill a set of design criteria across the entire human genome. For example, primers/probes for sequencing library generation can be as described, e.g., in Example 15 and above. Primer/probe designs can be customized to have a desired fractional annealing at a given temperature to increase specificity or yield. Primers/probes can be selected based on their sequence composition (GC content, pyrimidine content [e.g., <80% pyrimidine], absence of homopolymers >4 bases and of di- or tri-nucleotide repeats) and/or their thermodynamic properties (free energy of binding versus cross-hybridization versus self-hybridization, unified melting temp of 56-60° C.). These features can be parameterized to relate primer/probe performance as a linear and/or log-linear combination of parameters through singular value decomposition (SVD) or neural network to create a predictive model of design success (FIG. 50). 2) Primers/probes can be selected from the set to target a desired set of genes, e.g., genes associated with all cancers (pan-cancer panel) or genes associated with specific cancers. In some cases, primer/probe sets can be selected based on genes mutated in certain types of cancer, e.g., colon cancer, lung cancer, breast cancer, etc. 3) A subset of primers/probes can be used in methods of targeted sequencing described herein and to identify a set of molecular markers/variants that are unique to a tumor (e.g., a “signature”). 4) The identified molecular markers/variants can be used to determine potential therapies for a subject. 5) Sequences in primers/probes used in targeted sequencing (#3 above) of nucleic acid from a tumor (e.g., from a solid tumor sample) can be used in primers/probes to determine a presence or absence of the molecular markers/variants within a cell-free DNA component in a fluid sample, e.g., plasma, urine, cerebrospinal fluid (CSF), etc. (“liquid biopsy”). These primers/probes can be used for sequencing or amplification (e.g., dPCR, e.g., ddPCR) based analysis of a fluid sample. 6) Information for #1 above can be used to create a set of primers to identify a subset of markers identified in #3 above. Methods of designing probes for digital PCR to discriminate alleles are described herein. 7) A primer/probe set designed to fulfill a set of design criteria across an entire genome (#1) can be altered in that the ultimate 3′ base (i.e., the 3′-most base) of primers/probes can overlay a single nucleotide variant (SNV) identified in targeted sequencing (#3 above), to design allele-specific primers for universal probe assays for ddPCR, using design criteria that maximize discrimination of SNVs from their normal variant. 8) The assays designed in #6 and #7 can be used to monitor treatment efficacy over time. 9) The assays can also include designs for primers/probes to analyze specific genes, e.g., TP53.
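The sequence composition criteria listed under #1 above can be sketched as a simple Python filter. This is an illustrative sketch under assumptions: the text gives a pyrimidine limit (<80%) and a homopolymer limit (>4 bases) but does not state how many copies of a di- or tri-nucleotide unit constitute a disallowed repeat, so the max_repeat_units threshold below is hypothetical, as are the function names. GC content is reported but not filtered, since no GC range is stated, and the thermodynamic criteria are not covered.

```python
import re


def gc_fraction(seq):
    """Fraction of G and C bases in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def pyrimidine_fraction(seq):
    """Fraction of C and T bases in the sequence."""
    seq = seq.upper()
    return (seq.count("C") + seq.count("T")) / len(seq)


def passes_composition_filters(seq, max_pyrimidine=0.80,
                               max_homopolymer=4, max_repeat_units=3):
    """Return True if seq satisfies the composition criteria.

    max_repeat_units is an ASSUMED threshold: a di- or tri-nucleotide unit
    occurring more than this many times in tandem is rejected.
    """
    seq = seq.upper()
    # Pyrimidine content must be below the limit (e.g., <80%).
    if pyrimidine_fraction(seq) >= max_pyrimidine:
        return False
    # Reject homopolymer runs longer than max_homopolymer bases.
    if re.search(r"([ACGT])\1{%d,}" % max_homopolymer, seq):
        return False
    # Reject tandem di- or tri-nucleotide repeats beyond the threshold.
    if re.search(r"([ACGT]{2,3})\1{%d,}" % max_repeat_units, seq):
        return False
    return True
```

For example, under these assumed thresholds the sequence ATGCAGTCCATGGACTGATC passes, whereas a candidate containing AAAAA or four tandem copies of CA is rejected.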

A probe for sensitive detection of amplicons can be designed for highly sensitive exon discrimination, e.g., can be an exon-specific probe. Such probes can be designed to partially or fully overlay an exon-specific locus. An exon-specific probe can be designed to be inactive on a second exon. In some embodiments, these probes are designed for a duplex reaction in digital PCR.

A probe for sensitive detection of amplicons can be designed for highly sensitive gene-specific discrimination, e.g., can be a gene-specific probe. Such probes can be designed to partially or fully overlay intronic or exonic sequences from component intron and exon sequences within 1 or 2 or more genes or a gene-component-specific locus. A gene-specific probe can be designed to be inactive on a second gene-specific locus or a locus containing a combination of components from 2 or more genes. In some embodiments, these probes are designed for a duplex reaction in digital PCR.

In some cases, a condition is monitored via dPCR or sequencing by following a plurality of variants (base changes, or indels, or methylation, or any combination) that can correspond to a cancer in an aggregated manner, rather than investigating the nature of specific markers.

FIG. 1 depicts an exemplary workflow of a method for assessing cancer. In step 110, the method comprises sequencing cancer-related genes from a tumor sample isolated from said subject and optionally sequencing a set of cancer-related genes from normal cells isolated from said subject. The tumor sample can be a solid tumor sample. The normal cells can be blood cells isolated from a blood sample from the subject or a cheek swab. In step 120, sequence data from the tumor can be used to determine a tumor-specific sequence profile. In some embodiments, sequence data from the tumor is compared to sequence data from normal cells to generate the tumor-specific sequence profile. In some embodiments, the tumor-specific sequence profile comprises mutational status of one or more genes in the set. The method can further comprise generating a report describing the tumor-specific sequence profile. In some embodiments, the method further comprises choosing a subset of 2-4 genes known to harbor tumor-specific mutations for further monitoring. In some embodiments, the method comprises choosing a subset of no more than 4 genes known to harbor tumor-specific mutations. In step 130, cell-free DNA is obtained from a blood sample collected from the subject prior to treatment (e.g., tumor removal or therapeutic intervention) as well as at a later time point. In step 140, the cell-free DNA from the blood sample is assayed for the 2-4 genes in the subset to obtain quantitative measurement of the tumor-specific mutations.

FIG. 2 is a depiction of an exemplary workflow of a method as described in FIG. 1, from steps 110-120, for sequencing a tumor cell and a normal cell in a subject.

The tumor sample can be processed prior to sequencing by fixation in a formalin solution, followed by embedding in paraffin (e.g., is an FFPE sample). In some embodiments, the tumor sample is frozen prior to sequencing. In some embodiments, the tumor sample is neither fixed nor frozen. The unfixed, unfrozen tumor sample can be stored in a storage solution configured for the preservation of nucleic acid at room temperature. The storage solution can be a commercially available storage solution. Exemplary storage solutions include, but are not limited to, DNA storage solutions from Biomatrica (see, e.g., WO/2012/018638, WO/2009/038853, US20080176209), hereby incorporated by reference.

Further embodiments of the sequencing methods and assays for determining mutational status in the blood are described herein.

Next-Generation Sequencing

In some embodiments, the tumor sample and normal cells from the subject are sequenced. In some embodiments, nucleic acid is isolated from the tumor sample and normal cells using any methods known in the art. The nucleic acid can be DNA. The DNA from the tumor sample and normal cells can be used to prepare a subject-specific tumor DNA library and/or normal DNA library. DNA libraries can be used for sequencing by a sequencing platform. The sequencing platform can be a next-generation sequencing (NGS) platform. In some embodiments, the method further comprises sequencing the nucleic acid libraries using NGS technology. NGS technology can involve sequencing of clonally amplified DNA templates or single DNA molecules in a massively parallel fashion (e.g., as described in Voelkerding et al. Clin Chem 55:641-658 [2009]; Metzker M Nature Rev 11:31-46 [2010]). In addition to high-throughput sequence information, NGS provides digital quantitative information, in that each sequence read is a countable “sequence tag” representing an individual clonal DNA template or a single DNA molecule.

Next-Generation Sequencing Platforms

The next-generation sequencing platform can be a commercially available platform. Commercially available platforms include, e.g., platforms for sequencing-by-synthesis, ion semiconductor sequencing, pyrosequencing, reversible dye terminator sequencing, sequencing by ligation, single-molecule sequencing, sequencing by hybridization, and nanopore sequencing. Platforms for sequencing by synthesis are available from, e.g., Illumina, 454 Life Sciences, Helicos Biosciences, and Qiagen. Illumina platforms can include, e.g., Illumina's Solexa platform and Illumina's Genome Analyzer, and are described in Gudmundsson et al (Nat. Genet. 2009 41:1122-6), Out et al (Hum. Mutat. 2009 30:1703-12) and Turner (Nat. Methods 2009 6:315-6), U.S. Patent Application Pub nos. US20080160580 and US20080286795, U.S. Pat. Nos. 6,306,597, 7,115,400, and 7,232,656. 454 Life Sciences platforms include, e.g., the GS Flex and GS Junior, and are described in U.S. Pat. No. 7,323,305. Platforms from Helicos Biosciences include the True Single Molecule Sequencing platform. Platforms for ion semiconductor sequencing include, e.g., the Ion Torrent Personal Genome Machine (PGM) and are described in U.S. Pat. No. 7,948,015. Platforms for pyrosequencing include the GS Flex 454 system and are described in U.S. Pat. Nos. 7,211,390; 7,244,559; 7,264,929. Platforms and methods for sequencing by ligation include, e.g., the SOLiD sequencing platform and are described in U.S. Pat. No. 5,750,341. Platforms for single-molecule sequencing include the SMRT system from Pacific Biosciences and the Helicos True Single Molecule Sequencing platform.

While the automated Sanger method is considered a ‘first generation’ technology, Sanger sequencing, including automated Sanger sequencing, can also be employed by the method of the disclosure. Additional sequencing methods that comprise the use of developing nucleic acid imaging technologies, e.g., atomic force microscopy (AFM) or transmission electron microscopy (TEM), are also encompassed by the method of the disclosure. Exemplary sequencing technologies are described below.

The DNA sequencing technology can utilize the Ion Torrent sequencing platform, which pairs semiconductor technology with a sequencing chemistry to directly translate chemically encoded information (A, C, G, T) into digital information (0, 1) on a semiconductor chip. Without wishing to be bound by theory, when a nucleotide is incorporated into a strand of DNA by a polymerase, a hydrogen ion is released as a byproduct. The Ion Torrent platform detects the release of the hydrogen ion as a change in pH. A detected change in pH can be used to indicate nucleotide incorporation. The Ion Torrent platform comprises a high-density array of micro-machined wells to perform this biochemical process in a massively parallel way. Each well holds a different library member, which may be clonally amplified. Beneath the wells is an ion-sensitive layer and beneath that an ion sensor. The platform sequentially floods the array with one nucleotide after another. When a nucleotide, for example a C, is added to a DNA template and is then incorporated into a strand of DNA, a hydrogen ion will be released. The charge from that ion will change the pH of the solution, which can be identified by Ion Torrent's ion sensor. If the nucleotide is not incorporated, no voltage change will be recorded and no base will be called. If there are two identical bases on the DNA strand, the voltage will be double, and the chip will record two identical bases called. Direct identification allows recordation of nucleotide incorporation in seconds. Library preparation for the Ion Torrent platform generally involves ligation of two distinct adaptors at both ends of a DNA fragment.
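By way of illustration only, the relationship between per-flow signal and called bases described above can be sketched as follows. The function name, the flow order "TACG", and the normalized signal values are hypothetical and are not taken from any instrument software; the sketch simply assumes that a signal near 0 indicates no incorporation, a signal near 1 indicates one base, and a signal near 2 indicates two identical bases.

```python
from typing import List, Sequence


def call_bases_from_flows(flow_order: str, signals: Sequence[float]) -> str:
    """Convert normalized per-flow signals into a base sequence.

    flow_order is the repeating order in which nucleotides are flooded over
    the array (hypothetical here); signals holds one value per flow.
    """
    bases: List[str] = []
    for i, signal in enumerate(signals):
        nucleotide = flow_order[i % len(flow_order)]
        # Round to the nearest whole number of incorporations; never negative.
        count = max(0, int(signal + 0.5))
        # A doubled signal yields the same base called twice.
        bases.append(nucleotide * count)
    return "".join(bases)


# Flows: T, A, C, G, T, A, C, G
example_read = call_bases_from_flows(
    "TACG", [1.02, 0.05, 0.97, 0.0, 0.1, 1.95, 0.0, 1.1]
)
```

In this sketch the sixth flow (A) carries roughly double the unit signal and therefore contributes two A bases to the read, while flows with near-zero signal contribute no base.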

The DNA sequencing technology can utilize an Illumina sequencing platform, which generally employs cluster amplification of library members onto a flow cell and a sequencing-by-synthesis approach. Cluster-amplified library members are subjected to repeated cycles of polymerase-directed single base extension. Single-base extension can involve incorporation of reversible-terminator dNTPs, each dNTP labeled with a different removable fluorophore. The reversible-terminator dNTPs are generally 3′ modified to prevent further extension by the polymerase. After incorporation, the incorporated nucleotide can be identified by fluorescence imaging. Following fluorescence imaging, the fluorophore can be removed and the 3′ modification can be removed resulting in a 3′ hydroxyl group, thereby allowing another cycle of single base extension. Library preparation for the Illumina platform generally involves ligation of two distinct adaptors at both ends of a DNA fragment.

The DNA sequencing technology that is used in one or more methods of the disclosure can be the Helicos True Single Molecule Sequencing (tSMS), which can employ sequencing-by-synthesis technology. In the tSMS technique, a polyA adaptor can be ligated to the 3′ end of DNA fragments. The adapted fragments can be hybridized to poly-T oligonucleotides immobilized on the tSMS flow cell. The library members can be immobilized onto the flow cell at a density of about 100 million templates/cm². The flow cell can then be loaded into an instrument, e.g., a HeliScope™ sequencer, and a laser can illuminate the surface of the flow cell, revealing the position of each template. A CCD camera can map the position of the templates on the flow cell surface. The library members can be subjected to repeated cycles of polymerase-directed single base extension. The sequencing reaction begins by introducing a DNA polymerase and a fluorescently labeled nucleotide. The polymerase can incorporate the labeled nucleotides to the primer in a template-directed manner. The polymerase and unincorporated nucleotides can be removed. The templates that have directed incorporation of the fluorescently labeled nucleotide can be discerned by imaging the flow cell surface. After imaging, a cleavage step can remove the fluorescent label, and the process can be repeated with other fluorescently labeled nucleotides until a desired read length is achieved. Sequence information can be collected with each nucleotide addition step.

The DNA sequencing technology can utilize a 454 sequencing platform (Roche) (e.g., as described in Margulies, M. et al. Nature 437:376-380 [2005]). 454 sequencing generally involves two steps. In a first step, DNA can be sheared into fragments. The fragments can be blunt-ended. Oligonucleotide adaptors can be ligated to the ends of the fragments. The adaptors generally serve as primers for amplification and sequencing of the fragments. At least one adaptor can comprise a capture reagent, e.g., a biotin. The fragments can be attached to DNA capture beads, e.g., streptavidin-coated beads. The fragments attached to the beads can be PCR amplified within droplets of an oil-water emulsion, resulting in multiple copies of clonally amplified DNA fragments on each bead. In a second step, the beads can be captured in wells, which can be picoliter-sized. Pyrosequencing can be performed on each DNA fragment in parallel. Pyrosequencing generally detects release of pyrophosphate (PPi) upon nucleotide incorporation. PPi can be converted to ATP by ATP sulfurylase in the presence of adenosine 5′ phosphosulfate. Luciferase can use ATP to convert luciferin to oxyluciferin, thereby generating a light signal that is detected. A detected light signal can be used to identify the incorporated nucleotide.

The DNA sequencing technology can utilize a SOLiD™ technology (Applied Biosystems). The SOLiD platform generally utilizes a sequencing-by-ligation approach. Library preparation for use with a SOLiD platform generally comprises ligation of adaptors to the 5′ and 3′ ends of the fragments to generate a fragment library. Alternatively, internal adaptors can be introduced by ligating adaptors to the 5′ and 3′ ends of the fragments, circularizing the fragments, digesting the circularized fragment to generate an internal adaptor, and attaching adaptors to the 5′ and 3′ ends of the resulting fragments to generate a mate-paired library. Next, clonal bead populations can be prepared in microreactors containing beads, primers, template, and PCR components. Following PCR, the templates can be denatured. Beads can be enriched for beads with extended templates. Templates on the selected beads can be subjected to a 3′ modification that permits bonding to a glass slide. The sequence can be determined by sequential hybridization and ligation of partially random oligonucleotides with a central determined base (or pair of bases) that is identified by a specific fluorophore. After a color is recorded, the ligated oligonucleotide can be removed and the process can then be repeated.

The DNA sequencing technology can utilize a single molecule, real-time (SMRT™) sequencing platform (Pacific Biosciences). In SMRT sequencing, the continuous incorporation of dye-labeled nucleotides can be imaged during DNA synthesis. Single DNA polymerase molecules can be attached to the bottom surface of individual zero-mode waveguides (ZMWs) that obtain sequence information while phospholinked nucleotides are being incorporated into the growing primer strand. A ZMW generally refers to a confinement structure which enables observation of incorporation of a single nucleotide by DNA polymerase against a background of fluorescent nucleotides that rapidly diffuse in and out of the ZMW on a microsecond scale. By contrast, incorporation of a nucleotide generally occurs on a millisecond timescale. During this time, the fluorescent label can be excited to produce a fluorescent signal, which is detected. Detection of the fluorescent signal can be used to generate sequence information. The fluorophore can then be removed, and the process repeated. Library preparation for the SMRT platform generally involves ligation of hairpin adaptors to the ends of DNA fragments.

The DNA sequencing technology can utilize nanopore sequencing (e.g., as described in Soni G V and Meller A. Clin Chem 53: 1996-2001 [2007]). Nanopore sequencing DNA analysis techniques are being industrially developed by a number of companies, including Oxford Nanopore Technologies (Oxford, United Kingdom). Nanopore sequencing is a single-molecule sequencing technology whereby a single molecule of DNA is sequenced directly as it passes through a nanopore. A nanopore can be a small hole, of the order of 1 nanometer in diameter. Immersion of a nanopore in a conducting fluid and application of a potential (voltage) across it can result in a slight electrical current due to conduction of ions through the nanopore. The amount of current which flows is sensitive to the size and shape of the nanopore and to occlusion by, e.g., a DNA molecule. As a DNA molecule passes through a nanopore, each nucleotide on the DNA molecule obstructs the nanopore to a different degree, changing the magnitude of the current through the nanopore to different degrees. Thus, this change in the current as the DNA molecule passes through the nanopore represents a reading of the DNA sequence.

The DNA sequencing technology can utilize a chemical-sensitive field effect transistor (chemFET) array (e.g., as described in U.S. Patent Application Publication No. 20090026082). In one example of the technique, DNA molecules can be placed into reaction chambers, and the template molecules can be hybridized to a sequencing primer bound to a polymerase. Incorporation of one or more triphosphates into a new nucleic acid strand at the 3′ end of the sequencing primer can be discerned by a change in current by a chemFET. An array can have multiple chemFET sensors. In another example, single nucleic acids can be attached to beads, and the nucleic acids can be amplified on the bead, and the individual beads can be transferred to individual reaction chambers on a chemFET array, with each chamber having a chemFET sensor, and the nucleic acids can be sequenced.

The DNA sequencing technology can utilize transmission electron microscopy (TEM). The method, termed Individual Molecule Placement Rapid Nano Transfer (IMPRNT), generally comprises single atom resolution transmission electron microscope imaging of high-molecular weight (150 kb or greater) DNA selectively labeled with heavy atom markers and arranging these molecules on ultra-thin films in ultra-dense (3 nm strand-to-strand) parallel arrays with consistent base-to-base spacing. The electron microscope is used to image the molecules on the films to determine the position of the heavy atom markers and to extract base sequence information from the DNA. The method is further described in PCT patent publication WO 2009/046445. The method allows for sequencing complete human genomes in less than ten minutes.

The method can utilize sequencing by hybridization (SBH). SBH generally comprises contacting a plurality of polynucleotide sequences with a plurality of polynucleotide probes, wherein each of the plurality of polynucleotide probes can be optionally tethered to a substrate. The substrate might be a flat surface comprising an array of known nucleotide sequences. The pattern of hybridization to the array can be used to determine the polynucleotide sequences present in the sample. In other embodiments, each probe is tethered to a bead, e.g., a magnetic bead or the like. Hybridization to the beads can be identified and used to identify the plurality of polynucleotide sequences within the sample.

The length of the sequence read can vary depending on the particular sequencing technology utilized. NGS platforms can provide sequence reads that vary in size from tens to hundreds, or thousands of base pairs. In some embodiments of the method described herein, the sequence reads are about 20 bases long, about 25 bases long, about 30 bases long, about 35 bases long, about 40 bases long, about 45 bases long, about 50 bases long, about 55 bases long, about 60 bases long, about 65 bases long, about 70 bases long, about 75 bases long, about 80 bases long, about 85 bases long, about 90 bases long, about 95 bases long, about 100 bases long, about 110 bases long, about 120 bases long, about 130 bases long, about 140 bases long, about 150 bases long, about 200 bases long, about 250 bases long, about 300 bases long, about 350 bases long, about 400 bases long, about 450 bases long, about 500 bases long, about 600 bases long, about 700 bases long, about 800 bases long, about 900 bases long, about 1000 bases long, or more than 1000 bases long.

Partial sequencing of DNA fragments present in the sample can be performed, and sequence tags comprising reads that map to a known reference genome can be counted. Only sequence reads that uniquely align to the reference genome can be counted as sequence tags. In one embodiment, the reference genome is the human reference genome NCBI36/hg18 sequence, which is available on the world wide web at genome.ucsc.edu/cgi-bin/hgGateway?org=Human&db=hg18&hgsid=166260105. Other sources of public sequence information include GenBank, dbEST, dbSTS, EMBL (the European Molecular Biology Laboratory), and the DDBJ (the DNA Databank of Japan). The reference genome can also comprise the human reference genome NCBI36/hg18 sequence and an artificial target sequence genome, which includes polymorphic target sequences. In yet another embodiment, the reference genome is an artificial target sequence genome comprising polymorphic target sequences.
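By way of illustration only, counting of sequence tags restricted to uniquely aligning reads can be sketched as follows. The input format (one record per reported alignment, as a read identifier paired with a reference sequence name) and the function name are assumptions made for the sketch and do not correspond to the output of any particular aligner.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Tuple


def count_unique_tags(alignments: Iterable[Tuple[str, str]]) -> Dict[str, int]:
    """Count sequence tags per reference sequence, using unique reads only.

    alignments yields (read_id, reference_name) pairs, one pair per reported
    alignment. A read reported at more than one location is not counted.
    """
    hits: Dict[str, List[str]] = defaultdict(list)
    for read_id, reference_name in alignments:
        hits[read_id].append(reference_name)

    counts: Counter = Counter()
    for reference_names in hits.values():
        if len(reference_names) == 1:  # uniquely aligned read
            counts[reference_names[0]] += 1
    return dict(counts)


# Hypothetical alignment records; read r2 aligns to two locations.
example_alignments = [
    ("r1", "chr1"),
    ("r2", "chr1"),
    ("r2", "chr5"),
    ("r3", "chr21"),
    ("r4", "chr1"),
]
tag_counts = count_unique_tags(example_alignments)
```

In this sketch read r2 is excluded because it aligns to two locations, so only r1, r3, and r4 contribute sequence tags.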

Mapping of the sequence tags can be achieved by comparing the sequence of the tag with the sequence of the reference genome to determine the chromosomal origin of the sequenced nucleic acid (e.g., cell free DNA) molecule, and specific genetic sequence information is not needed. A number of computer algorithms are available for aligning sequences, including without limitation BLAST (Altschul et al., 1990), BLITZ (MPsrch) (Sturrock & Collins, 1993), FASTA (Pearson & Lipman, 1988), BOWTIE (Langmead et al, Genome Biology 10:R25.1-R25.10 [2009]), or ELAND (Illumina, Inc., San Diego, Calif., USA). In one embodiment, one end of the clonally expanded copies of the DNA molecule is sequenced and processed by bioinformatic alignment analysis for the Illumina Genome Analyzer, which uses the Efficient Large-Scale Alignment of Nucleotide Databases (ELAND) software. Additional software includes SAMtools (SAMtools, Bioinformatics, 2009, 25(16):2078-9), and the Burrows-Wheeler block sorting compression procedure which involves block sorting or preprocessing to make compression more efficient.
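By way of illustration only, the block sorting transform named above can be sketched in its simplest textbook form: all rotations of a terminated sequence are sorted, and the last column of the sorted rotations is kept. The function names and the "$" terminator are assumptions made for the sketch. This naive construction is intended only to show the transform and its reversibility; it is not the indexed implementation used by production aligners.

```python
def bwt(text: str, terminator: str = "$") -> str:
    """Return the block sorting transform of text (naive construction)."""
    if terminator in text:
        raise ValueError("text must not contain the terminator character")
    s = text + terminator
    # Sort every rotation of the terminated string, then keep the last column.
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)


def inverse_bwt(transformed: str, terminator: str = "$") -> str:
    """Recover the original text from its transform (naive reconstruction)."""
    table = [""] * len(transformed)
    for _ in range(len(transformed)):
        # Prepend the transform column to each row, then re-sort the rows.
        table = sorted(transformed[i] + table[i] for i in range(len(transformed)))
    original = next(row for row in table if row.endswith(terminator))
    return original[:-1]


transformed = bwt("ACAACG")
restored = inverse_bwt(transformed)
```

The transform tends to group identical characters together, which is what makes subsequent compression more efficient, and it can be inverted without loss.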

The sequencing platforms described herein generally comprise a solid support having immobilized thereon surface-bound oligonucleotides which allow for the capture and immobilization of sequencing library members to the solid support. Surface-bound oligonucleotides generally comprise sequences complementary to the adaptor sequences of the sequencing library.

Nucleic acid samples can be used to prepare nucleic acid libraries for sequencing. Preparation of nucleic acid libraries can comprise any method known in the art or as described herein. As used herein, the terms “library” and “sequencing library” are used interchangeably and can refer to a plurality of nucleic acid fragments obtained from a biological sample. Generally, the fragments are modified with an adaptor sequence which effects coupling (e.g., capture and/or immobilization) of the fragments to a sequencing platform. An adaptor sequence can comprise a defined oligonucleotide sequence that effects coupling of a library member to a sequencing platform. By way of example only, the adaptor can comprise a sequence that is at least 25% complementary or identical to an oligonucleotide sequence immobilized onto a solid support (e.g., a sequencing flow cell or bead). An adaptor sequence can comprise a defined oligonucleotide sequence that is at least 70% complementary or identical to a sequencing primer. The sequencing primer can enable nucleotide incorporation by a polymerase, wherein incorporation of the nucleotide is monitored to provide sequencing information. The sequencing primer can be about 15-25 bases. In some embodiments, the sequencing primer is conjugated to the 3′ end of the adaptor. In some embodiments, an adaptor comprises a sequence that is at least 25% complementary or identical to an oligonucleotide sequence immobilized onto a solid support and a sequence that is at least 70% complementary or identical to a sequencing primer. Coupling can also be achieved through serially stitching adaptors together. The number of adaptors that can be stitched can be 1, 2, 3, 4 or more. The stitched adaptors can be at least 35 bases, 70 bases, 105 bases, 140 bases or more.
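By way of illustration only, one way to compute a percent complementarity between an adaptor and an immobilized oligonucleotide is sketched below. The disclosure does not prescribe a particular calculation; the sketch assumes an ungapped, position-by-position comparison of the adaptor against the reverse complement of the immobilized oligonucleotide, with both sequences written 5′ to 3′ and aligned from their first bases. The function names and example sequences are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def percent_complementary(adaptor: str, surface_oligo: str) -> float:
    """Percent of compared positions at which the adaptor can base-pair.

    Assumption: ungapped comparison over the shorter of the two lengths.
    """
    target = reverse_complement(surface_oligo)
    compared = min(len(adaptor), len(target))
    if compared == 0:
        return 0.0
    matches = sum(1 for a, b in zip(adaptor.upper(), target) if a == b)
    return 100.0 * matches / compared


fully_paired = percent_complementary("AAAA", "TTTT")
half_paired = percent_complementary("AACC", "GGAA")
```

Under these assumptions, an adaptor would satisfy the "at least 25% complementary" example above whenever the returned value is 25.0 or greater.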

The adaptor can comprise a barcode sequence. At least 50%, 60%, 70%, 80%, 90%, or 100% of sequencing library members in a library can comprise the same adaptor sequence. At least 50%, 60%, 70%, 80%, 90%, or 100% of the ssDNA library members can comprise an adaptor sequence at a first end but not at a second end. In some embodiments, the first end is a 5′ end. In some embodiments, the first end is a 3′ end. The adaptor sequence can be chosen by a user according to the sequencing platform used for sequencing. By way of example only, an Illumina sequencing by synthesis platform comprises a solid support with a first and second population of surface-bound oligonucleotides immobilized thereon. Such oligonucleotides comprise a sequence for hybridizing to a first and second Illumina-specific adaptor oligonucleotide and priming an extension reaction. Accordingly, a DNA library member can comprise a first Illumina-specific adaptor that is partially or wholly complementary to a first population of surface-bound oligonucleotides of an Illumina system. By way of other example only, the SOLiD system, the Ion Torrent system, and the GS FLEX system each comprise a solid support in the form of a bead with a single population of surface-bound oligonucleotides immobilized thereon. Accordingly, in some embodiments the ssDNA library member comprises an adaptor sequence that is complementary to a surface-bound oligonucleotide of a SOLiD system, Ion Torrent system, or GS FLEX system.

Accordingly, in one aspect, the disclosure provides improved methods of preparing a nucleic acid library. The nucleic acid library can be a DNA library. The method can comprise ligation of adaptor sequences to DNA fragments. The method can improve efficiency of adaptor ligation by at least 10-fold. In some embodiments, the nucleic acid library is a ssDNA library. In some embodiments, the nucleic acid library is a partial ssDNA library. In some embodiments, the nucleic acid library is a double stranded DNA (dsDNA) library.

ssDNA Fragment/ssDNA Library Preparation

In some embodiments, the ssDNA fragment is a member of a ssDNA library. The single-stranded nucleic acid library is prepared from a sample of double-stranded nucleic acid using any means known in the art or described herein.

The starting sample can be a biological sample obtained from a subject. Exemplary subjects and biological samples are described herein. In particular embodiments, the sample is a solid biological sample, e.g., a tumor sample. In some embodiments, the solid biological sample is processed prior to the probe-based assay. Processing can comprise fixation in a formalin solution, followed by embedding in paraffin (e.g., is an FFPE sample). Processing can alternatively comprise freezing of the sample prior to conducting the probe-based assay. In some embodiments, the sample is neither fixed nor frozen. The unfixed, unfrozen sample can be, by way of example only, stored in a storage solution configured for the preservation of nucleic acid. Exemplary storage solutions are described herein. In some embodiments, non-nucleic acid materials can be removed from the starting material using enzymatic treatments (for example, with a protease). The sample can optionally be subjected to homogenization, sonication, French press, dounce, or freeze/thaw, which can be followed by centrifugation. The centrifugation may separate nucleic acid-containing fractions from non-nucleic acid-containing fractions. In some embodiments, the sample is a liquid biological sample. Exemplary liquid biological samples are described herein. In some embodiments, the liquid biological sample is a blood sample (e.g., whole blood, plasma, or serum). In some embodiments, a whole blood sample is separated into acellular components (e.g., plasma, serum) and cellular components by use of a Ficoll reagent, as described in detail in Fuss et al, Curr Protoc Immunol (2009) Chapter 7:Unit7.1, which is incorporated herein by reference.

Nucleic acid can be isolated from the biological sample using any means known in the art. For example, nucleic acid can be extracted from the biological sample using liquid extraction (e.g., Trizol, DNAzol) techniques. Nucleic acid can also be extracted using commercially available kits (e.g., Qiagen DNeasy kit, QIAamp kit, Qiagen Midi kit, QIAprep spin kit).

Nucleic acid can be concentrated by known methods, including, by way of example only, centrifugation. Nucleic acid can be bound to a selective membrane (e.g., silica) for the purposes of purification. Nucleic acid can also be enriched for fragments of a desired length, e.g., fragments which are less than 1000, 500, 400, 300, 200 or 100 base pairs in length. Such an enrichment based on size can be performed using, e.g., PEG-induced precipitation, an electrophoretic gel or chromatography material (Huber et al. (1993) Nucleic Acids Res. 21:1061-6), gel filtration chromatography, or TSK gel (Kato et al. (1984) J. Biochem, 95:83-86), which publications are hereby incorporated by reference.

Polynucleotides extracted from a biological sample can be selectively precipitated or concentrated using any methods known in the art.

The nucleic acid sample can be enriched for target polynucleotides. Target enrichment can be by any means known in the art. For example, the nucleic acid sample may be enriched by amplifying target sequences using target-specific primers. The target amplification can occur in a digital PCR format, using any methods or systems known in the art. The nucleic acid sample may be enriched by capture of target sequences onto an array having immobilized thereon target-selective oligonucleotides. The nucleic acid sample may be enriched by hybridizing to target-selective oligonucleotides free in solution or on a solid support. The oligonucleotides may comprise a capture moiety which enables capture by a capture reagent. Exemplary capture moieties and capture reagents are described herein. In some embodiments, the nucleic acid sample is not enriched for target polynucleotides, e.g., represents a whole genome.

Accordingly, in some aspects the disclosure provides a method of preparing a single-stranded nucleic acid library. The single-stranded nucleic acid library can be a single-stranded DNA library (ssDNA library) or an RNA library. A method of preparing an ssDNA library can comprise denaturing a double stranded DNA fragment into ssDNA fragments, ligating a primer docking sequence onto one end of the ssDNA fragment, and hybridizing a primer to the primer docking sequence. The primer can comprise at least a portion of an adaptor sequence that couples to a next-generation sequencing platform. The method can further comprise extension of the hybridized primer to create a duplex, wherein the duplex comprises the original ssDNA fragment and an extended primer strand. The extended primer strand can be separated from the original ssDNA fragment. The extended primer strand can be collected, wherein the extended primer strand is a member of the ssDNA library. A method of preparing an RNA library can comprise ligating a primer docking sequence onto one end of the RNA fragment and hybridizing a primer to the primer docking sequence. The primer can comprise at least a portion of an adaptor sequence that couples to a next-generation sequencing platform. The method can further comprise extension of the hybridized primer to create a duplex, wherein the duplex comprises the original RNA fragment and an extended primer strand. The extended primer strand can be separated from the original RNA fragment. The extended primer strand can be collected, wherein the extended primer strand is a member of the RNA library.
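By way of illustration only, the ssDNA library preparation steps described above (denaturation, ligation of a primer docking sequence, primer hybridization, and primer extension) can be modeled on sequence strings as follows. All function names and all sequences are hypothetical. The sketch assumes the primer docking sequence is ligated onto the 3′ end of the fragment, that the primer consists of an adaptor portion followed by a portion complementary to the docking sequence, and that all sequences are written 5′ to 3′.

```python
from typing import Tuple

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def denature(top_strand: str) -> Tuple[str, str]:
    """Split a duplex, given by its top strand, into two single strands."""
    return top_strand, reverse_complement(top_strand)


def ligate_docking_sequence(ssdna_fragment: str, docking_sequence: str) -> str:
    """Ligate the primer docking sequence onto the 3' end of the fragment."""
    return ssdna_fragment + docking_sequence


def extend_primer(template: str, annealing_portion: str, adaptor_portion: str) -> str:
    """Anneal the primer at the template's 3' end and extend it to the 5' end.

    Returns the extended primer strand: adaptor portion, annealing portion,
    then the copy (reverse complement) of the original fragment.
    """
    docking_site = reverse_complement(annealing_portion)
    if not template.endswith(docking_site):
        raise ValueError("primer does not anneal to the 3' end of the template")
    fragment = template[: len(template) - len(docking_site)]
    return adaptor_portion + annealing_portion + reverse_complement(fragment)


# Hypothetical sequences for illustration.
strand, opposite_strand = denature("ACGTTG")
template = ligate_docking_sequence(strand, "CTGTCA")
library_member = extend_primer(
    template,
    annealing_portion=reverse_complement("CTGTCA"),
    adaptor_portion="AATT",
)
```

In this sketch the collected library member carries the adaptor portion at its 5′ end and the copy of the original fragment at its 3′ end, consistent with a library member having an adaptor sequence at one end only.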

In some aspects provided herein is a method of preparing a double-stranded nucleic acid library. The double-stranded nucleic acid library can be a cDNA library or a genomic DNA library. A method of preparing a dsDNA library can comprise fragmenting double stranded DNA into dsDNA fragments. In some cases, the dsDNA (e.g., cell-free dsDNA) is not subjected to a fragmentation step. In some cases, an adaptor is ligated to the dsDNA or dsDNA fragment. An adaptor can be ligated to one end of the dsDNA or dsDNA fragments or both ends of the dsDNA or dsDNA fragments. An adaptor can be ligated to a 5′ end of the dsDNA or dsDNA fragment, a 3′ end of the dsDNA or dsDNA fragment, or both a 5′ end and a 3′ end of the dsDNA or dsDNA fragment. In some cases, an adaptor is configured such that it is not capable of ligating to a 5′ end or 3′ end of the dsDNA or dsDNA fragment. The adaptor can comprise sequence for annealing to a primer, e.g., an amplification primer. In some cases, a dsDNA library comprising dsDNA with adaptors at both ends is amplified using primers that anneal to the adaptors. In some cases, a dsDNA library comprising dsDNA with adaptors at both ends is not amplified using primers that anneal to the adaptors.

Members of a dsDNA library can be denatured. A target specific primer can be annealed to a target sequence in the denatured dsDNA library. The target specific primer can comprise a 3′ end that anneals to a specific target sequence and a 5′ end that does not anneal to the target sequence. The 5′ end can comprise a second adaptor sequence. The second adaptor sequence can be different from the adaptor sequence ligated to the dsDNA library. The target specific primer annealed to the target sequence can be extended to generate a primer extension product. The primer extension product can be annealed to the target sequence following extension. The primer extension product/target sequence hybrid can be denatured to form single stranded primer extension product. The primer extension product can be amplified, e.g., using a primer that anneals to the adaptor sequence used in ligation and a primer that anneals to the complement of the adaptor sequence at the 5′ end of the target specific primer.

In various aspects, dsDNA can be fragmented by any means known in the art or as described herein. dsDNA can be fragmented by physical means, for example, by mechanical shearing, by nebulization, or by sonication; by chemical means, such as treatment with Fe(II)-EDTA chelate; or by enzymatic means, such as a plurality of nicking enzymes, restriction enzymes, or fragmentases (NEB).

In some embodiments, cDNA is generated from RNA using random primed reverse transcription (RNaseH+) to generate randomly sized cDNA.

The nucleic acid fragments (e.g., dsDNA fragments, RNA, or randomly sized cDNA) can be less than 1000 bp, less than 800 bp, less than 700 bp, less than 600 bp, less than 500 bp, less than 400 bp, less than 300 bp, less than 200 bp, or less than 100 bp. The DNA fragments can be about 40 to about 100 bp, about 50 to about 125 bp, about 100 to about 200 bp, about 150 to about 400 bp, about 300 to about 500 bp, about 100 to about 500 bp, about 400 to about 700 bp, about 500 to about 800 bp, about 700 to about 900 bp, about 800 to about 1000 bp, or about 100 to about 1000 bp.

The ends of dsDNA fragments can be polished (e.g., blunt-ended). The ends of DNA fragments can be polished by treatment with a polymerase. Polishing can involve removal of 3′ overhangs, fill-in of 5′ overhangs, or a combination thereof. The polymerase can be a proofreading polymerase (e.g., comprising 3′ to 5′ exonuclease activity). The proofreading polymerase can be, e.g., a T4 DNA polymerase, Pol I Klenow fragment, or Pfu polymerase. Polishing can comprise removal of damaged nucleotides (e.g., abasic sites), using any means known in the art.

Ligation of an adaptor to a 3′ end of a nucleic acid fragment can comprise formation of a bond between a 3′ OH group of the fragment and a 5′ phosphate of the adaptor. Therefore, removal of 5′ phosphates from nucleic acid fragments can minimize aberrant ligation of two library members. Accordingly, in some embodiments, 5′ phosphates are removed from nucleic acid fragments. In some embodiments, 5′ phosphates are removed from at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, or greater than 95% of nucleic acid fragments in a sample. In some embodiments, substantially all phosphate groups are removed from nucleic acid fragments. In some embodiments, substantially all phosphates are removed from at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, or greater than 95% of nucleic acid fragments in a sample. Removal of phosphate groups from a nucleic acid sample can be by any means known in the art. Removal of phosphate groups can comprise treating the sample with heat-labile phosphatase. In some embodiments, phosphate groups are not removed from the nucleic acid sample. In some embodiments, ligation of an adaptor to the 5′ end of the nucleic acid fragment is performed.

Denaturation

ssDNA can be prepared from dsDNA fragments prepared by any means known in the art or as described herein, by denaturation into single strands. Denaturation of dsDNA can be by any means known in the art, including heat denaturation, incubation in basic pH, and denaturation by urea or formaldehyde.

Heat denaturation can be achieved by heating a dsDNA sample to about 60° C. or above, about 65° C. or above, about 70° C. or above, about 75° C. or above, about 80° C. or above, about 85° C. or above, about 90° C. or above, about 95° C. or above, or about 98° C. or above. The dsDNA sample can be heated by any means known in the art, including, e.g., incubation in a water bath, a temperature controlled heat block, or a thermal cycler. In some embodiments the sample is heated for 0.5, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more than 10 minutes.

Denaturation by incubation in basic pH can be achieved by, for example, incubation of a dsDNA sample in a solution comprising sodium hydroxide (NaOH) or potassium hydroxide (KOH). In some embodiments, denaturation is achieved by incubation in basic pH at about pH 7.1, pH 7.2, pH 7.3, pH 7.4, pH 7.5, pH 7.6, pH 7.7, pH 7.8, pH 7.9, pH 8, pH 8.1, pH 8.2, pH 8.3, pH 8.4, pH 8.5, pH 8.6, pH 8.7, pH 8.8, pH 8.9, pH 9, pH 9.5, pH 10, pH 10.5, pH 11, pH 11.5, pH 12, pH 12.5, pH 13, or greater. In some embodiments, denaturation is achieved by incubation in basic pH close to neutral. In some embodiments, denaturation is achieved by incubation in basic pH of about pH 7.5 to about pH 9, about pH 8 to about pH 10, or about pH 7 to about pH 8. The solution can comprise about 1 mM NaOH, 2 mM NaOH, 5 mM NaOH, 10 mM NaOH, 20 mM NaOH, 40 mM NaOH, 60 mM NaOH, 80 mM NaOH, 100 mM NaOH, 0.2M NaOH, about 0.3M NaOH, about 0.4M NaOH, about 0.5M NaOH, about 0.6M NaOH, about 0.7M NaOH, about 0.8M NaOH, about 0.9M NaOH, about 1.0M NaOH, or greater than 1.0M NaOH. The solution can comprise about 1 mM KOH, 2 mM KOH, 5 mM KOH, 10 mM KOH, 20 mM KOH, 40 mM KOH, 60 mM KOH, 80 mM KOH, 100 mM KOH, 0.2M KOH, 0.5M KOH, 1M KOH, or greater than 1M KOH. In some embodiments, the dsDNA sample is incubated in NaOH or KOH for 0.5, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 30, 40, 50, 60, or more than 60 minutes. The dsDNA can be incubated with sodium or ammonium salts of acetic acid, or acetic acid, following NaOH or KOH incubation to neutralize the alkaline solution.
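As a rough numerical aid for relating the hydroxide concentrations listed above to solution pH, the following sketch can be used. It assumes an ideal, unbuffered, fully dissociated strong base at 25° C. (pKw of 14) and ignores activity effects and any buffer or sample components; the function name is hypothetical and is not part of the disclosure.

```python
import math

def strong_base_ph(hydroxide_molar: float, pkw: float = 14.0) -> float:
    """Idealized pH of an unbuffered strong-base solution (assumption: pKw = 14)."""
    if hydroxide_molar <= 0:
        raise ValueError("hydroxide concentration must be positive")
    # pOH = -log10([OH-]) and pH = pKw - pOH
    return pkw + math.log10(hydroxide_molar)

# Concentrations taken from the ranges listed above (NaOH or KOH).
for label, molar in [("1 mM", 1e-3), ("10 mM", 1e-2), ("100 mM", 1e-1), ("1.0M", 1.0)]:
    print(f"{label} hydroxide: pH of about {strong_base_ph(molar):.1f}")
```

Under this idealization, 1 mM to 1.0M hydroxide corresponds to about pH 11 to about pH 14; a buffered solution would behave differently.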

Compounds like urea and formamide contain functional groups that can form H-bonds with the electronegative centers of the nucleotide bases. At high concentrations (e.g., 8M urea or 70% formamide) of the denaturant, the competition for H-bonds favors interactions between the denaturant and the N-bases rather than between complementary bases, thereby separating the two strands.

Ligation of Primer-Docking Oligonucleotide.

A primer-docking oligonucleotide (pdo) can be ligated onto one end of a nucleic acid fragment (e.g., ssDNA, RNA, dsDNA). The pdo can be ligated onto a 5′ end or a 3′ end. In some embodiments, the pdo is ligated onto a 3′ end of the nucleic acid fragment.

The pdo generally comprises a sequence that acts as a template for annealing a primer. The sequence of the pdo can comprise a sequence that is at least 70% complementary to a portion or all of an adaptor sequence for coupling to an NGS platform (NGS adaptor). The pdo can comprise a sequence complementary or identical to 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 20, or more than 20 contiguous nucleotides of an NGS adaptor. In some embodiments, the pdo does not comprise a sequence complementary to a portion or all of an NGS adaptor.
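The contiguous-nucleotide criterion above can be checked computationally. The following is a minimal sketch, assuming unambiguous A/C/G/T sequences written 5′ to 3′; the function names and the toy sequences are hypothetical and do not correspond to any actual pdo or NGS adaptor.

```python
_COMP = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Reverse complement of an A/C/G/T sequence written 5'->3'."""
    return seq.upper().translate(_COMP)[::-1]

def longest_shared_run(a: str, b: str) -> int:
    """Length of the longest contiguous stretch common to a and b."""
    a, b = a.upper(), b.upper()
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best

def pdo_adaptor_overlap(pdo: str, adaptor: str) -> dict:
    """Longest run of the pdo that is identical or complementary to the adaptor."""
    return {
        "identical_run": longest_shared_run(pdo, adaptor),
        "complementary_run": longest_shared_run(pdo, reverse_complement(adaptor)),
    }

# Toy sequences (hypothetical): the pdo carries the complement of 10 adaptor bases.
toy_adaptor = "ACGTTGCAAGGCTA"
toy_pdo = reverse_complement(toy_adaptor[2:12]) + "TT"
print(pdo_adaptor_overlap(toy_pdo, toy_adaptor))
```

A pdo would satisfy, for example, the "10 contiguous nucleotides" criterion if either reported run is at least 10.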

The pdo can be adenylated at a 5′ end. The pdo can be conjugated to a capture moiety that is capable of forming a complex with a capture reagent. The capture moiety can be conjugated to the adaptor oligonucleotide by any means known in the art. Capture moiety/capture reagent pairs are known in the art. In some embodiments, the capture reagent is avidin, streptavidin, or neutravidin and the capture moiety is biotin. In another embodiment, the capture moiety/capture reagent pair is digoxigenin/wheat germ agglutinin.

Ligation of the pdo to the nucleic acid fragment can be effected by an ATP-dependent ligase. In some embodiments, the ATP-dependent ligase is an RNA ligase. The RNA ligase can be an ATP-dependent ligase. The RNA ligase can be an Rnl 1 or Rnl 2 family ligase. Generally, Rnl 1 family ligases can repair single-stranded breaks in tRNA. Exemplary Rnl 1 family ligases include, e.g., T4 RNA ligase, thermostable RNA ligase 1 from Thermus scotoductus bacteriophage TS2126 (CircLigase), or CircLigase II. These ligases generally catalyze the ATP-dependent formation of a phosphodiester bond between a nucleotide 3′-OH nucleophile and a 5′ phosphate group. Generally, Rnl 2 family ligases can seal nicks in duplex RNAs. Exemplary Rnl 2 family ligases include, e.g., T4 RNA ligase 2. The RNA ligase can be an archaeal RNA ligase, e.g., an archaeal RNA ligase from the thermophilic archaeon Methanobacterium thermoautotrophicum (MthRnl).

The ligation of the pdo's to the single-stranded nucleic acid fragment can comprise preparing a reaction mixture comprising a nucleic acid fragment, a pdo, and ligase. In some embodiments, the reaction mixture is heated to effect ligation of the adaptor oligonucleotides to the ssDNA fragments. In some embodiments, the reaction mixture is heated to about 50° C., about 55° C., about 60° C., about 65° C., about 70° C., or above 70° C. In some embodiments, the reaction mixture is heated to about 60-70° C. The reaction mixture can be heated for a sufficient time to effect ligation of the pdo to the nucleic acid fragment. In some embodiments, the reaction mixture is heated for about 5 min, about 10 min, about 15 min, about 20 min, about 25 min, about 30 min, about 35 min, about 40 min, about 45 min, about 50 min, about 55 min, about 60 min, about 70 min, about 80 min, about 90 min, about 120 min, about 150 min, about 180 min, about 210 min, about 240 min, or more than 240 min.

In some embodiments, the pdo's are present in the reaction mixture in a concentration that is greater than the concentration of nucleic acid fragments in the mixture. In some embodiments, the pdo's are present at a concentration that is at least 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 100% or more than 100% greater than the concentration of nucleic acid fragments in the mixture. The pdo's can be present at a concentration that is at least 10-fold, 100-fold, 1000-fold, or 10000-fold greater than the concentration of nucleic acid fragments in the mixture. The pdo's can be present at a final concentration of 0.1 uM, 0.5 uM, 1 uM, 10 uM or greater. In some embodiments, the ligase is present in the reaction mixture at a saturating amount.
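The fold excess of pdo over nucleic acid fragments can be estimated from the input mass, as in the following sketch. It assumes an average mass of about 330 g/mol per ssDNA nucleotide (a common approximation); the function names and the example input (10 ng of 200-nucleotide fragments in 20 uL) are hypothetical.

```python
AVG_SSDNA_NT_MASS = 330.0  # g/mol per nucleotide; approximate, assumed for illustration

def fragment_molarity(mass_ng: float, mean_length_nt: float, volume_ul: float) -> float:
    """Molar concentration of ssDNA fragments from mass, mean length, and volume."""
    moles = (mass_ng * 1e-9) / (mean_length_nt * AVG_SSDNA_NT_MASS)
    return moles / (volume_ul * 1e-6)

def fold_excess(pdo_molar: float, fragment_molar: float) -> float:
    """Ratio of pdo concentration to fragment concentration."""
    return pdo_molar / fragment_molar

# Hypothetical input: 10 ng of 200-nt fragments in a 20 uL reaction, 0.5 uM pdo.
frag = fragment_molarity(mass_ng=10, mean_length_nt=200, volume_ul=20)
print(f"fragments: {frag * 1e9:.2f} nM")
print(f"0.5 uM pdo: {fold_excess(0.5e-6, frag):.0f}-fold molar excess")
```

With these assumed inputs the pdo is in roughly 66-fold molar excess, which falls between the 10-fold and 100-fold levels recited above.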

The reaction mixture can additionally comprise a high molecular weight inert molecule, e.g., PEG of MW 4000, 6000, or 8000. The inert molecule can be present in an amount that is about 0.5%, 1%, 2%, 3%, 4%, 5%, 7.5%, 10%, 12.5%, 15%, 17.5%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, or greater than 50% weight/volume. In some embodiments, the inert molecule is present in an amount that is about 0.5-2%, about 1-5%, about 2-15%, about 10-20%, about 15-30%, about 20-50%, or more than 50% weight/volume.

The reaction mixture in which ligation occurs can comprise a pH in a range of about pH 1 to pH 14. In some embodiments, the reaction mixture in which ligation occurs comprises a pH of at least pH 7.1, pH 7.2, pH 7.3, pH 7.4, pH 7.5, pH 7.6, pH 7.7, pH 7.8, pH 7.9, pH 8, pH 8.1, pH 8.2, pH 8.3, pH 8.4, pH 8.5, pH 8.6, pH 8.7, pH 8.8, pH 8.9, pH 9, pH 9.5, pH 10, pH 10.5, pH 11, pH 11.5, pH 12, pH 12.5, pH 13, or greater. In some embodiments, the reaction mixture in which ligation occurs comprises a pH of about neutral. In some embodiments, the reaction mixture in which ligation occurs comprises a pH of about pH 7.1 to about pH 9, about pH 7.5 to about pH 9, about pH 8 to about pH 10, or about pH 7 to about pH 8. The pH of a reaction mixture in which ligation occurs can be less than pH 14, 13, 12, 11, 10, 9, 8, 7, 6.5, 6, 5.5, 5, 4.5, 4, 3.5, 3, 2.5, 2, 1.5, or 1. The pH of a reaction mixture in which ligation occurs can be about pH 5 to about pH 6, about pH 4 to about pH 5, about pH 3 to about pH 4, about pH 2 to about pH 3, or about pH 1 to about pH 2.

After sufficient time has elapsed to effect ligation of adaptors to the ss or ds nucleic acid molecules, unreacted adaptors can be removed by any means known in the art, e.g., filtration by molecular weight cutoff, size exclusion chromatography, use of a spin column, selective precipitation with polyethylene glycol (PEG), selective precipitation with PEG onto a silica or carboxylate matrix, alcohol precipitation, sodium acetate precipitation, PEG and salt precipitation, or high stringency washing.

In some embodiments, the method further comprises capturing the ligated nucleic acid fragment. Capturing of the ligated nucleic acid fragment can occur prior to extension or subsequent to extension. The ligated nucleic acid fragment can be captured onto a solid support. Capturing can involve the formation of a complex comprising a capture moiety conjugated to a pdo and a capture reagent. In some embodiments, the capture reagent is immobilized onto a solid support. In some embodiments, the solid support comprises an excess of capture reagent as compared to the amount of ligated nucleic acid comprising the capture moiety. In some embodiments, the solid support comprises 5-fold, 10-fold, or 100-fold more available binding sites than the total number of ligated nucleic acid fragments comprising the capture moiety.

Extension

In some embodiments, a primer is hybridized to the ligated nucleic acid fragment via the pdo. The primer can comprise a portion or entirety of an NGS adaptor sequence. Exemplary NGS adaptor sequences are described herein. In some embodiments, the primer is extended to create a duplex comprising the original nucleic acid fragment and the extended primer, wherein the extended primer comprises a reverse complement of the original nucleic acid fragment and an NGS adaptor sequence at one end. In some embodiments, the NGS adaptor is at the 5′ end. Exemplary NGS adaptor sequences are described herein. In some embodiments, the NGS adaptor sequence comprises a sequence that is at least 70% identical to a surface-bound oligonucleotide of an NGS platform. In some embodiments, the NGS adaptor sequence comprises a sequence that is at least 70% complementary to a surface-bound oligonucleotide of an NGS platform. In some embodiments, the NGS adaptor sequence comprises a sequence that is at least 70% identical to a sequencing primer for use by an NGS platform. In some embodiments, the NGS adaptor sequence comprises a sequence that is at least 70% complementary to a sequencing primer for use by an NGS platform. Extension can be effected by a proofreading mesophilic or thermophilic DNA polymerase. Preferably, the polymerase is a thermophilic polymerase with 5′-3′ exonucleolytic/endonucleolytic (DNA polymerases I, II, III) or 3′-5′ exonucleolytic (family A or B DNA polymerases, DNA polymerase I, T4 DNA polymerase) activity. In some instances, the polymerase can have no exonuclease activity (Taq). In some cases, the polymerase effects linear amplification of the immobilized ligated fragment, creating a plurality of copies of the reverse complement of the immobilized ligated fragment. In other cases, only one copy of the reverse complement is created. In some embodiments, the extended primer molecules are separated from the original nucleic acid template (e.g., by denaturation as described herein). The extended primer molecules are free in solution while the original nucleic acid template molecules remain immobilized to the solid support. The extended primer molecules can be easily harvested, resulting in a nucleic acid library preparation in which most of the library members comprise an NGS adaptor. At least 50%, 60%, 70%, 80%, 90%, more than 90%, or substantially all of the library members can comprise an NGS adaptor.
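The sequence relationships in the extension step can be illustrated with a short string model. The sketch below assumes an ideal, full-length extension, a pdo ligated to the 3′ end of the fragment, and a primer whose 3′ portion is fully complementary to the pdo; all sequences and function names are hypothetical toy examples.

```python
_COMP = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Reverse complement of an A/C/G/T sequence written 5'->3'."""
    return seq.upper().translate(_COMP)[::-1]

def extend_primer(fragment: str, pdo: str, adaptor_tail: str):
    """Return (ligated_template, extended_primer), both written 5'->3'."""
    ligated = fragment + pdo  # pdo ligated onto the 3' end of the fragment
    # Primer: NGS adaptor sequence at the 5' end, pdo-complementary region at the 3' end.
    primer = adaptor_tail + reverse_complement(pdo)
    # The polymerase copies the template from the pdo toward the fragment's 5' end.
    extended = primer + reverse_complement(fragment)
    return ligated, extended

ligated, extended = extend_primer("ACCGTTAGC", "GGATCCAA", "TTTTCAGT")
print("ligated template:", ligated)
print("extended primer: ", extended)
```

In this model the extended primer is the NGS adaptor sequence followed by the reverse complement of the ligated fragment, so the adaptor sits at the 5′ end of each released library member.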

An exemplary workflow for preparing a nucleic acid library (e.g., ssDNA library, dsDNA library) is outlined below.

FIG. 3 depicts an exemplary embodiment of the method for preparing a nucleic acid library from nucleic acids (e.g., DNA or RNA) isolated from a biological sample (e.g., a blood, plasma, urine, stool, or mucosal sample). The nucleic acids obtained can be fragmented by enzymatic or mechanical means to 100-1000 bp, but preferably 100-500 bp, fragments. The nucleic acids can be fragmented in situ. Nucleic acids can be fragmented from formalin-fixed paraffin-embedded (FFPE) tissues or circulating DNA. Nucleic acids can be isolated from FFPE tissues and circulating DNA using kits (e.g., Qiagen, Covaris). In some embodiments, the nucleic acids are DNA. In some embodiments, the nucleic acids are dsDNA. In some embodiments, dsDNA is denatured to generate ssDNA. In some embodiments, the DNA is cDNA generated from RNA isolated from the same biological sample using random primed reverse transcription (RNaseH+) to generate randomly sized cDNA. In some embodiments, the nucleic acid is RNA. Fragmented DNA can be treated with a base excision repair enzyme (e.g., Endo VIII, formamidopyrimidine DNA glycosylase (FPG)) to excise damaged bases that can interfere with polymerization. DNA can then be treated with a proof-reading polymerase (e.g., T4 DNA polymerase) to polish ends and replace damaged nucleotides (e.g., abasic sites). In some embodiments, DNA is not treated with a proof-reading polymerase to polish ends and replace damaged nucleotides.

In step 1, the nucleic acids (e.g., DNA or RNA) can be treated with heat-labile phosphatase to remove all phosphate groups from the nucleic acids. The reaction mixture can be heated to 80° C. for 10 min to inactivate the phosphatase and polymerase and denature double stranded DNA (dsDNA) to single strands.

In step 2, a chemically or enzymatically phosphorylated pdo, 12 to 50 bases in length and containing a 3′-end affinity tag (e.g., biotin), can be ligated to the fragmented single-strand nucleic acids at a final concentration of 0.5 uM or greater with a saturating amount of ATP-dependent RNA ligase (T4 RNA ligase, but preferably a thermophilic ligase such as CircLigase or CircLigase II) in the presence of 10-20% (w/v) polyethylene glycol of average molecular weight 4000, 6000, or 8000. The reaction can be incubated for 1 hr at 60-70° C. The pdo can comprise the following: (i) all, part, or none of the sequence corresponding to a surface-bound oligonucleotide for Illumina flow cell cluster generation; and (ii) a 3′-end affinity group that is incapable of participating in the ligation reaction and that is linked to the oligonucleotide at a sufficient distance (10 atoms or greater) to minimize steric hindrance of the interaction between the affinity ligand and the bound receptor.
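Stock volumes for a step 2 ligation reaction can be worked out with the usual dilution relation (C1·V1 = C2·V2). The sketch below is illustrative only: the 20 uL reaction volume, the 10 uM pdo stock, and the 50% (w/v) PEG stock are assumed values that are not specified in the disclosure, and the function name is hypothetical.

```python
def volume_from_stock(final_conc: float, stock_conc: float, total_volume_ul: float) -> float:
    """Volume of stock (uL) giving final_conc in total_volume_ul (same concentration units)."""
    if final_conc > stock_conc:
        raise ValueError("final concentration cannot exceed stock concentration")
    return final_conc / stock_conc * total_volume_ul

total_ul = 20.0                                    # assumed reaction volume
pdo_ul = volume_from_stock(0.5, 10.0, total_ul)    # 0.5 uM final from an assumed 10 uM stock
peg_ul = volume_from_stock(15.0, 50.0, total_ul)   # 15% w/v final from an assumed 50% w/v stock
other_ul = total_ul - pdo_ul - peg_ul              # left for sample, buffer, ATP, and ligase

print(f"pdo stock: {pdo_ul:.1f} uL, PEG stock: {peg_ul:.1f} uL, remainder: {other_ul:.1f} uL")
```

The final pdo concentration (0.5 uM) and PEG content (within 10-20% w/v) follow the values recited in step 2; everything else in the sketch is an assumption.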

The pdo can be adenylated by any means known in the art. If an adenylated adaptor is used, in some embodiments the ATP-dependent RNA ligase is not CircLigase or CircLigase II. In some embodiments, an ATP-dependent RNA ligase is not required. The reaction can be purified by size to remove unreacted adaptor. This can be achieved through the use of a microfiltration unit with a molecular size cutoff of 10K or 3K (e.g., Microcon YM-10 or YM-3, or Nanosep Omega). Alternatively, adaptor removal can be achieved through passage through a size exclusion desalting column (agarose, polyacrylamide) with a size exclusion cutoff of 10K or less, through the use of a spin column, through selective precipitation with PEG, alcohol or salt, high stringency washing, or denaturing gel electrophoresis.

In step 6 an oligonucleotide primer either fully complementary to the adaptor or partially complementary to the adaptor at its 3′-end, but fully possessing the sequence corresponding to the Illumina flow-cell oligonucleotides, can then be used to create a reverse complement of the bound library using a proofreading mesophilic DNA polymerase. Preferably, a thermophilic polymerase with 5′-3′ exonucleolytic/endonucleolytic (Family A DNA polymerase, e.g., DNA polymerase I) or 3′-5′ exonucleolytic (family B DNA polymerases, Vent, Phusion, Pfu and their variants) activity is used to permit linear amplification of the library.

In step 7 the recovered material can then be bound to an affinity resin or support capable of binding to the 3′-end affinity tag in batch mode. The recovered material can be put into a pre-rinsed support in a 0.2 ml tube containing at least 10-fold excess and preferably 100-fold more available binding sites than the total number of tagged adaptor molecules.

In step 8 the supernatant consisting of copies of the bound library can be harvested and quantified.

FIG. 4 is a depiction of an exemplary workflow as described in FIG. 3 for preparing an ssDNA library. In step 410 dsDNA is fragmented. In step 420 dsDNA fragments are dephosphorylated and heat-denatured into single strands. In step 430 biotinylated pdo's comprising a primer-docking sequence 431 are contacted with the nucleic acid fragments. In step 440 the pdo's are ligated to the 3′ ends of the ssDNA fragments to create library member precursors. In step 450 primers comprising sequence complementary to the pdo 451 and adaptor sequence 452 are hybridized in step 560 to the ssDNA via the pdo's. In step 460 the hybridized primers are extended along the template ssDNA fragments to create duplexes. The duplexes are immobilized onto a solid support (e.g., streptavidin coated beads). Heat denaturation releases the final library members into solution while retaining the original ssDNA fragment on the bead.

Alternative Embodiments of ssDNA Library Preparation.

In another aspect, the disclosure provides a method of preparing an ssDNA library, comprising denaturing dsDNA fragments into ssDNA, and ligating adaptor sequences to both ends of the ssDNA molecules. Methods of fragmenting dsDNA are described herein. Methods of denaturing dsDNA fragments are described herein.

The method can comprise ligating a first adaptor that comprises a sequence that is at least 70% complementary or identical to a first surface-bound oligonucleotide. The first surface-bound oligonucleotide can be an NGS platform-specific surface bound oligonucleotide. The first adaptor can comprise a sequence complementary or identical to 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 20, or more than 20 contiguous nucleotides of the surface-bound oligonucleotide. The first adaptor can further comprise a sequence that is at least 70% complementary to a first sequencing primer. In some embodiments, the first adaptor is ligated to a 3′ end of an ssDNA fragment using a method described herein or any method known in the art. In some embodiments, the ssDNA fragment lacks 5′ phosphate groups. In particular embodiments, the first adaptor is ligated to the 3′ end of the ssDNA fragment by an ATP-dependent ligase. In other embodiments, the first adaptor comprises a 3′ terminal blocking group. Generally, the 3′ terminal blocking group will prevent the formation of a covalent bond between the 3′ terminal base and another nucleotide. In some embodiments, the 3′ terminal blocking group is dideoxy-dNTP or biotin. The first adaptor can be 5′ adenylated. In some embodiments, the first adaptor is ligated to a 3′ end of an ssDNA fragment by an RNA ligase as described herein. The RNA ligase can be truncated or mutated RNA ligase 2 from T4 or Mth. The method can further comprise ligating a second adaptor sequence to a 5′ end of the ssDNA fragment. The second adaptor sequence can be distinct from the first adaptor sequence. The second adaptor sequence can comprise a sequence that is at least 70% complementary to a second surface-bound oligonucleotide. The second surface-bound oligonucleotide can be an NGS platform-specific surface bound oligonucleotide. The second adaptor can comprise a sequence complementary or identical to 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 20, or more than 20 contiguous nucleotides of the surface-bound oligonucleotide. The second adaptor can further comprise a sequence that is at least 70% complementary to a second sequencing primer. In some embodiments, the second adaptor is ligated to the ssDNA fragment using RNA ligase, e.g., CircLigase as described herein. In some embodiments, the first and second adaptor are both at least 70% complementary to the first and second surface-bound oligonucleotides. In other embodiments, the first and second adaptor are both at least 70% identical to the first and second surface-bound oligonucleotides.

The ssDNA library produced using methods described herein can be used for whole genome sequencing or targeted sequencing. In some embodiments, the ssDNA library produced using methods described herein is enriched for target polynucleotides of interest prior to sequencing.

Target Enrichment

In another aspect, the disclosure provides a method for preparing a target-enriched nucleic acid library. The method can involve hybridizing a target-selective oligonucleotide (TSO) to a single stranded DNA (ssDNA) fragment to create a hybridization product, and amplifying the hybridization product in a single round of amplification to create an extension strand.

The method of target enrichment can be as described in U.S. Patent Application Pub. No. 20120157322, hereby incorporated by reference.

The hybridizing and amplifying can occur in a reaction mixture. The term “reaction mixture” as used herein generally refers to a mixture of components necessary to amplify at least one amplicon from nucleic acid template molecules. The mixture may comprise nucleotides (dNTPs), a polymerase and a target-selective oligonucleotide. In some embodiments, the mixture comprises a plurality of target-selective oligonucleotides. The mixture may further comprise a Tris buffer, a monovalent salt, and Mg2+. The concentration of each component is well known in the art and can be further optimized by an ordinarily skilled artisan. The reaction mixture can also comprise additives including, but not limited to, non-specific background/blocking nucleic acids (e.g., salmon sperm DNA), biopreservatives (e.g., sodium azide), PCR enhancers (e.g., Betaine, Trehalose, etc.), and inhibitors (e.g., RNAse inhibitors). In some embodiments, a nucleic acid sample (e.g., a sample comprising an ssDNA fragment) is admixed with the reaction mixture. Accordingly, in some embodiments the reaction mixture further comprises a nucleic acid sample.

The ssDNA fragment can be a member of an ssDNA library. The ssDNA library can be prepared using a method as described herein. The ssDNA fragment can comprise a first single-stranded adaptor sequence located at a first end but not at a second end. In some embodiments, the first end is a 5′ end. In some embodiments, the TSO comprises a second single-stranded adaptor sequence located at a first end but not a second end. The first end can be a 5′ end. In some embodiments, the first adaptor sequence comprises a sequence that is at least 70% identical to a first surface-bound oligonucleotide. In some embodiments, the first adaptor sequence comprises a sequence that is at least 70% identical to a sequencing primer. In some embodiments, the first adaptor further comprises a barcode sequence. In some embodiments, the second adaptor comprises a sequence that is at least 70% identical to a second surface-bound oligonucleotide. In some embodiments, the second adaptor comprises a sequence that is at least 70% identical to a sequencing primer.

The target-selective oligonucleotide (tso) can be designed to at least partially hybridize to a target polynucleotide of interest. In some embodiments, the tso is designed to selectively hybridize to the target polynucleotide. The tso can be at least about 70%, 75%, 80%, 85%, 90%, 95%, or more than 95% complementary to a sequence in the target polynucleotide. In some embodiments, the tso is 100% complementary to a sequence in the target polynucleotide. The hybridization can result in a tso/target duplex with a Tm. The Tm of the tso/target duplex can be between 0-100° C., between 20-90° C., between 40-80° C., between 50-70° C., or between 55-65° C. The tso generally is sufficiently long to prime the synthesis of extension products in the presence of a polymerase. The exact length and composition of a tso can depend on many factors, including temperature of the annealing reaction, source and composition of the primer, and ratio of primer:probe concentration. The tso can be, for example, 8-50, 10-40, or 12-24 nucleotides in length.
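A first-pass screen of candidate tso sequences against the complementarity and Tm ranges above might look like the following sketch. It uses the Wallace rule of thumb (2° C. per A or T, 4° C. per G or C), which is only a coarse approximation for short oligonucleotides and is an assumption here, not a method stated in the disclosure; the function names and the toy tso sequence are hypothetical.

```python
_COMP = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Reverse complement of an A/C/G/T sequence written 5'->3'."""
    return seq.upper().translate(_COMP)[::-1]

def wallace_tm_celsius(oligo: str) -> int:
    """Rule-of-thumb Tm: 2 C per A/T plus 4 C per G/C (coarse, short oligos only)."""
    s = oligo.upper()
    return 2 * (s.count("A") + s.count("T")) + 4 * (s.count("G") + s.count("C"))

def percent_complementary(tso: str, target_site: str) -> float:
    """Percent of tso positions that pair with an equal-length target site (both 5'->3')."""
    if len(tso) != len(target_site):
        raise ValueError("tso and target site must have the same length")
    expected = reverse_complement(target_site)
    matches = sum(a == b for a, b in zip(tso.upper(), expected))
    return 100.0 * matches / len(tso)

toy_tso = "GATTACAGGCTTACCGATGA"            # hypothetical 20-nt tso
toy_site = reverse_complement(toy_tso)      # perfectly matched target site
print(wallace_tm_celsius(toy_tso), percent_complementary(toy_tso, toy_site))
```

A nearest-neighbor thermodynamic model would generally give a better Tm estimate than this rule of thumb, particularly for longer oligonucleotides or non-standard salt conditions.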

Amplification

The method can comprise amplification of the target in the reaction mixture. The amplification can be primed by a tso in a tso/target duplex. In some embodiments, amplification is carried out utilizing a nucleic acid polymerase. The nucleic acid polymerase can be a DNA polymerase. In particular embodiments, the DNA polymerase is a thermostable DNA polymerase. The polymerase can be a member of A or B family DNA proofreading polymerases (Vent, Pfu, Phusion, and their variants), a DNA polymerase holoenzyme (DNA pol III holoenzyme), a Taq polymerase, or a combination thereof.

Amplification can be carried out as an automated process wherein the reaction mixture comprising template DNA is cycled through a denaturing step, a primer annealing step, and a synthesis step, whereby cleavage and displacement occurs simultaneously with primer-dependent template extension. The automated process may be carried out using a PCR thermal cycler. Commercially available thermal cycler systems include systems from Bio-Rad Laboratories, Life Technologies, Perkin-Elmer, among others. In some embodiments, one cycle of amplification is performed.

Amplification of the tso/target duplex can result in an extension product comprising the original ssDNA fragment comprising the target sequence, and an extended strand comprising the second adaptor sequence, the tso, a reverse complement of the target sequence, and a reverse complement of the first adaptor sequence. If the first adaptor sequence of the original ssDNA fragment was 70% or more identical to a first surface-bound oligonucleotide, then the extended strand would comprise a first adaptor sequence that is 70% or more complementary to the first surface-bound oligonucleotide, and thereby would be hybridizable to the first surface-bound oligonucleotide. The extended strands can comprise the target-enriched library.
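The composition of the extended strand can be modeled with simple string operations. The sketch below assumes an ssDNA fragment carrying the first adaptor at its 5′ end only, a tso that is perfectly complementary to its target site, and complete extension to the end of the template; the function names and all sequences are hypothetical toy examples.

```python
_COMP = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Reverse complement of an A/C/G/T sequence written 5'->3'."""
    return seq.upper().translate(_COMP)[::-1]

def extended_strand(insert: str, first_adaptor: str, tso_body: str, second_adaptor: str):
    """Extended strand (5'->3') from one round of tso-primed extension, or None if no site."""
    template = first_adaptor + insert            # first adaptor at the 5' end only
    site = reverse_complement(tso_body)          # where the tso anneals on the template
    start = template.find(site, len(first_adaptor))
    if start < 0:
        return None                              # fragment lacks the target; not enriched
    upstream = template[:start]                  # template 5' of the site, incl. first adaptor
    # second adaptor + tso + copied target region + reverse complement of first adaptor
    return second_adaptor + tso_body + reverse_complement(upstream)

first, second = "ACACGACG", "CAAGCAGA"           # toy adaptor sequences
tso_body = reverse_complement("GACTGACC")        # toy tso targeting the site GACTGACC
on_target = extended_strand("AAGGCCTTGACTGACCATG", first, tso_body, second)
off_target = extended_strand("TTTTTTTTTTTTTTTTTTT", first, tso_body, second)
print(on_target, off_target)
```

Only fragments containing the target site yield a strand with the second adaptor at the 5′ end and the reverse complement of the first adaptor at the 3′ end, which is what allows capture by the first surface-bound oligonucleotide in the steps that follow.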

The extension products in the reaction mixture can be denatured. The denatured extension products can be contacted with a surface having immobilized thereon at least a first surface-bound oligonucleotide. In some embodiments, the extended strand is captured by the first surface-bound oligonucleotide, which can anneal to the first adaptor sequence on the extended strand.

The first surface-bound oligonucleotide can prime the extension of the captured extended strand. In some embodiments, extension of the captured extended strand results in a captured extension product. The captured extension product comprises the first surface-bound oligonucleotide, the target sequence, and a second adaptor sequence that is at least 70% complementary to a second surface-bound oligonucleotide.

In some embodiments, the captured extension product hybridizes to the second surface-bound oligonucleotide, forming a bridge. In some embodiments, the bridge is amplified by bridge PCR. Bridge PCR methods can be carried out using methods known in the art.

Kits for Library Preparation and Target Enrichment

Also provided are kits for practicing a method of library preparation as described herein or target-enrichment as described herein.

In one aspect, the kit comprises reagents for repairing and chemical denaturation of dsDNA. In one embodiment, the kit comprises reagents for purification of single-stranded DNA. In one embodiment, the kit comprises enzymes for excision of damaged bases. In some embodiments, the kit comprises a phosphatase. In one embodiment, the kit comprises a kinase. In some embodiments, the kit comprises a terminal transferase and dideoxynucleotides to block the 3′-end of DNA fragments.

In one aspect, the disclosure provides kits for preparing an ssDNA library. In one embodiment, the kit comprises a pdo as described herein. In some embodiments, the kit comprises instructions, e.g., instructions for ligating a pdo to an ssDNA fragment. The kit can further comprise a ligase. The ligase can be an Rnl 1 or Rnl 2 family ligase, as described herein. The kit can further comprise a primer which can hybridize to the pdo. Primers hybridizable to the pdo are described herein. In some embodiments, the kit provides a solid support, e.g., a bead having immobilized thereon a capture reagent. In some embodiments, the kit provides a polymerase for conducting an extension reaction. In some embodiments, the kit provides dNTPs for conducting an extension reaction.

In another embodiment, the kit comprises a first adaptor oligonucleotide that comprises a sequence that is at least 70% complementary to a first support-bound oligonucleotide coupled to a sequencing platform, a second adaptor oligonucleotide that comprises a sequence that is distinct from the first adaptor, an RNA ligase, and instructions for use, e.g., instructions for practicing a method of the disclosure. In some embodiments, the first adaptor comprises a 3′ terminal blocking group that prevents the formation of a covalent bond between the 3′ terminal base and another nucleotide. 3′ terminal blocking groups are described herein. In some embodiments, the first adaptor is 5′ adenylated. In some embodiments, the first adaptor comprises a sequence that is at least 70% complementary to a sequencing primer. In some embodiments, the second adaptor comprises a sequence that is at least 70% complementary to a sequencing primer. In some embodiments, the second adaptor comprises a sequence that is at least 70% complementary to a second support-bound oligonucleotide coupled to a sequencing platform.

The disclosure provides kits for preparing a target-enriched DNA library. In some embodiments, the kit comprises a pdo, a ligase, a primer which can hybridize to the pdo, a solid support comprising a capture reagent, a polymerase, dNTPs, or any combination thereof. In some embodiments, the kit further comprises a tso. The tso can be immobilized on a solid support coupled for sequencing on an NGS platform, as described in U.S. Patent Application Pub. No. 20120157322, hereby incorporated by reference.

In some embodiments, kits of the disclosure include a packaging material. As used herein, the term “packaging material” can refer to a physical structure housing the components of the kit. The packaging material can maintain sterility of the kit components, and can be made of material commonly used for such purposes (e.g., paper, corrugated fiber, glass, plastic, foil, ampules, etc.). Kits can also include a buffering agent, a preservative, or a protein/nucleic acid stabilizing agent.

Sequencing

In some embodiments, the target-enriched libraries are sequenced using any methods known in the art or as described herein. Sequencing can reveal the presence of mutations in one or more cancer-related genes in the set. In some embodiments, a subset of 2, 3, or 4 genes harboring the mutations is selected for further monitoring by assessment of cell-free DNA in a fluid sample isolated from the subject at later time points. In some embodiments, a subset of no more than 4 genes harboring the mutations is selected for further monitoring by assessment of cell-free DNA in a fluid sample isolated from the subject at later time points.

Assessment of Cell-Free DNA Over Time

In some embodiments, assessment of cell-free DNA comprises detection and/or measurement of alleles of the subset of genes, as shown in FIG. 5. FIG. 5 depicts tumor DNA 601 entering the bloodstream of a subject. Detection of the alleles can be by any means known in the art or as described herein. The detection can be by methods as described in U.S. Pat. No. 5,538,848 (e.g., using a Taqman assay) or as described herein. A cell-free DNA sample can include plasma, serum, sputum, saliva, urine, cerebral spinal fluid, mucosal secretions, amniotic fluid, or sweat.

Accordingly, the present disclosure provides methods and kits for the sensitive detection of a mutation in a target polynucleotide. In some aspects, the methods and kits of the disclosure can be used for the discrimination of alleles in a target polynucleotide. For example, the disclosure provides methods and kits for the detection of mutant alleles in a background of high wild-type allelic ratio. For another example, the disclosure provides methods and kits for the detection of multiple alleles. In some embodiments, detection of an allele is enabled by release or activation of a detectable signal if the interrogated allele is present.

Methods for Allele Detection

In some aspects, one or more methods of allele detection as described herein relate to the ability of an oligonucleotide primer to bind to a target polynucleotide region suspected of harboring the mutation. The oligonucleotide primer can partially overlay a locus of the suspected mutation. In some embodiments, the oligonucleotide primer completely overlays the mutation. Accordingly, in some embodiments the mutation is small enough to be encompassed by an oligonucleotide primer. The mutation can be a single nucleotide polymorphism (SNP). The mutation can also comprise multiple nucleotide polymorphisms (e.g., double mutation or triple mutation). The mutation can be an insertion of one or more nucleotides. The mutation can be an insertion of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 50, 100, 500, 1000, 10000, 100000, or 1000000 nucleotides. The mutation can be an insertion of 1-5, 2-10, 5-15, or 10-20 nucleotides. In some embodiments, the mutation is a deletion of one or more nucleotides. The mutation can be a deletion of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 nucleotides. The mutation can be a deletion of 1-5, 2-10, 5-15, or 10-20 nucleotides. The mutation can be an inversion of two or more nucleotides. In some embodiments, 2, 3, 4, 5, or more nucleotides are inverted. In some embodiments, the mutation is a copy number variation (e.g., a copy number variation of a SNP or wild-type allele).

In one aspect, the disclosure provides a method of detecting a mutation in a target polynucleotide region, comprising the steps of: (a) contacting a nucleic acid sample with a reaction mixture for allele detection, wherein the reaction mixture for allele detection comprises an oligonucleotide primer capable of hybridizing to the target polynucleotide region, wherein the oligonucleotide primer comprises a probe binding region and a template binding region that at least partially overlays a locus suspected of harboring the mutation and is capable of allele-specific extension by a polymerase; (b) extending the oligonucleotide primer to form an extension product; and (c) detecting the extension product, whereby detecting the extension product indicates the presence of the mutation.

Primers for Allele Detection

The oligonucleotide primer (e.g., a forward primer) can be designed to at least partially hybridize to a target polynucleotide suspected of harboring a mutation. In some embodiments, the template binding region of the forward primer is designed to selectively hybridize to the target polynucleotide. The hybridization can result in a forward primer/template duplex with a Tm. The Tm of the primer/template duplex can be between 0-100° C., between 20-90° C., between 40-80° C., between 50-70° C., or between 55-65° C. The template binding region of the forward primer can be 8-15, 8-30, 8-50, 10-40, 5-100, or 12-24 nucleotides in length. The template binding region of the forward primer can be designed to at least partially overlay a particular locus suspected of harboring a mutation. The template binding region of the forward primer can, for example, overlay at least about 0.5%, 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, or 100% of the locus suspected of harboring the mutation. The template binding region of the forward primer can overlay at least about 0.5-2%, 1-10%, 5-20%, 10-50%, 30-70%, 50-80%, 60-90%, or 80-100% of the locus suspected of harboring the mutation. The template binding region can be located at a 3′ region of the forward primer. In some embodiments, the region of the template binding region that overlays the locus is a 3′ terminal region. In some embodiments, the 3′ terminal region that overlays the mutation locus comprises 1, 2, 3, 4, 5, or more than 5 bases of the 3′-end of the template binding region. In some embodiments, the 3′ terminal base of the forward primer overlays the locus. In some embodiments, the 3′ terminal region of the forward primer is complementary to the interrogated allele. In other embodiments, the 3′ terminal base of the forward primer is not complementary to the interrogated allele. In some embodiments, one or more mismatches are introduced into the 3′-region adjacent to the 3′-terminal base (e.g., n-1, n-2, n-3, etc.). These mismatches can be nucleotides or modified nucleotides that increase or decrease the impact of the mismatch on primer extension.
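
By way of illustration only, the relationship between the 3′ terminal base of a forward primer and the interrogated allele can be sketched in Python as follows. The sequences, the function names, and the rule that extension is permitted only when the 3′ terminal base matches are simplifying assumptions made for this sketch and are not part of the disclosure. The primer is compared against the sense-strand window it overlays, so that an identical base at a position corresponds to a complementary base pair with the strand to which the primer anneals.

```python
# Hypothetical sketch: locate primer/template mismatches relative to the 3' end.
# Position 0 is the 3'-terminal base, 1 is n-1, 2 is n-2, and so on.

def mismatch_positions(template_binding_region, sense_strand_window):
    """Return mismatch positions counted from the 3' end of the primer.

    Both arguments are written 5'->3' and must have the same length; the
    window is the sense-strand sequence that the primer overlays.
    """
    primer = template_binding_region.upper()
    window = sense_strand_window.upper()
    if len(primer) != len(window):
        raise ValueError("primer and window must have the same length")
    n = len(primer)
    return [n - 1 - i for i in range(n) if primer[i] != window[i]]


def is_extendable(template_binding_region, sense_strand_window):
    """Simplified assumption: extension requires a matched 3'-terminal base."""
    return 0 not in mismatch_positions(template_binding_region, sense_strand_window)


# Made-up example sequences differing only at the final (interrogated) base.
wild_type = "ACGTTGCAGGTCAGTA"
mutant = "ACGTTGCAGGTCAGTG"

# Primer whose 3'-terminal base matches the mutant allele.
primer = "ACGTTGCAGGTCAGTG"
# Same primer with a deliberate mismatch introduced at position n-2.
primer_n2 = "ACGTTGCAGGTCACTG"

print(is_extendable(primer, mutant))              # True
print(is_extendable(primer, wild_type))           # False
print(mismatch_positions(primer_n2, wild_type))   # [2, 0]
```

In this sketch the deliberately mismatched primer still carries a matched 3′ terminal base on the mutant template, and carries two mismatches near the 3′ end on the wild-type template.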

The template binding region can at least partially overlay a locus that is suspected of having a copy number variation. In some embodiments, the template binding region of the forward primer can overlay at least about 0.5-2%, 1-10%, 5-20%, 10-50%, 30-70%, 50-80%, 60-90%, or 80-100% of the locus suspected of having a copy number variation.

The 3′ terminal region of the forward primer can comprise nucleotides linked by phosphorothioate linkages. In some embodiments, at least 2, 3, 4, 5, or more nucleotides at the 3′ terminal region of the forward primer are linked by phosphorothioate linkages.

A forward primer can further comprise a probe-binding region. Generally, the probe-binding region of the forward primer enables use of a reporter probe that is template independent. The probe-binding region can comprise a unique sequence or barcode that does not hybridize to the template nucleic acid. The probe-binding region can, for example, be designed to avoid significant sequence similarity or complementarity to known genomic sequences of an organism of interest. Such unique sequences can be randomly generated, e.g., by a computer readable medium, and selected by BLASTing against known nucleotide databases such as, e.g., EMBL, GenBank, or DDBJ. The barcode sequence can also be designed to avoid secondary structure. Tools for probe design are known in the art, and include, e.g., mFold and Primer Express. The probe-binding region can be 5-50, 6-40, or 7-30 nucleotides in length. The probe-binding region can correspond to a region of a surface-binding oligonucleotide for bridge amplification and/or generating sequencing information. The probe-binding region can be 1-100, 1-20, 3-15, or 6-8 nucleotides away from the template binding region of the forward primer. The probe-binding region can be located 5′ of the template binding region. In some embodiments, the probe is not a low Tm probe. In some embodiments, the probe is a low Tm probe comprising: a detectable moiety; a quencher moiety; and a melting temperature (Tm) below 50° C. In some embodiments, the low Tm probe has a length of 8-30 nucleotides. In some embodiments, the detectable moiety is quenched at a temperature of 55° C. or higher. In some embodiments, the low Tm probe does not hybridize to a complementary template nucleic acid at an ambient temperature above 55° C. In some embodiments, the quencher moiety quenches the detectable moiety if the probe is not hybridized to a template strand. In some embodiments, the Tm of the low Tm probe is between 30-45° C. In some embodiments, the fluorophore moiety and quencher moiety of the low Tm probe are spaced at least seven nucleotides apart. In some embodiments, the low Tm probe comprises a nucleotide with a Tm enhancing base. In some embodiments the nucleotide with a Tm enhancing base is a Superbase, locked nucleotide, or bridge nucleotide. In some embodiments, the detectable moiety of the low Tm probe comprises a fluorophore.
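
By way of example only, random generation of candidate probe-binding sequences can be sketched in Python as follows. The length of 20 nucleotides falls within the 5-50 nucleotide range stated above; the GC-content window, the homopolymer limit, and all function names are assumptions chosen for illustration. The sketch does not perform the database screening step (e.g., BLAST against EMBL, GenBank, or DDBJ) or a secondary-structure prediction, both of which would still be applied to the candidates it returns.

```python
import random


def has_homopolymer(seq, max_run=3):
    """Return True if any base repeats more than max_run times in a row."""
    run = 1
    for prev, cur in zip(seq, seq[1:]):
        run = run + 1 if cur == prev else 1
        if run > max_run:
            return True
    return False


def gc_fraction(seq):
    """Fraction of G and C bases in the sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def candidate_probe_binding_regions(n, length=20, seed=0, gc_range=(0.4, 0.6)):
    """Draw n distinct random sequences that pass simple composition filters.

    The filters (GC window, homopolymer limit) are illustrative assumptions;
    candidates still need screening against sequence databases.
    """
    rng = random.Random(seed)  # seeded so that a run is reproducible
    picked = []
    while len(picked) < n:
        seq = "".join(rng.choice("ACGT") for _ in range(length))
        if has_homopolymer(seq):
            continue
        if not gc_range[0] <= gc_fraction(seq) <= gc_range[1]:
            continue
        if seq in picked:
            continue
        picked.append(seq)
    return picked


candidates = candidate_probe_binding_regions(5)
for seq in candidates:
    print(seq, round(gc_fraction(seq), 2))
```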

In some embodiments, the method further comprises contacting the nucleic acid sample with a reverse primer. The reverse primer can be an oligonucleotide primer that corresponds to a region of template nucleic acid that is downstream of the forward primer. In some embodiments, the reverse primer is downstream of the interrogated allele. The reverse primer can bind to a reverse complement strand of the target polynucleotide. A forward/reverse primer pair can span a target region suspected of harboring a mutation. In some embodiments, the reverse primer can be an oligonucleotide that is the reverse complement of a pdoligated to the 3′-end of a plurality of DNA fragments. In some embodiments, the target region is 14-1000, 20-800, 40-600, 50-500, 70-300, 90-200, or 100-150 nucleotides long.

Primers or other oligonucleotides used in the present disclosure may further comprise a barcode sequence. Barcode sequences are described herein. In some embodiments, a barcode sequence encodes information relating to the identity of an interrogated allele, an individual molecule, identity of a target polynucleotide or genomic locus, identity of a sample, a subject, or any combination thereof. A barcode sequence can be a portion of a primer, a reporter probe, or both. A barcode sequence may be at the 5′-end or 3′-end of an oligonucleotide, or may be located in any region of the oligonucleotide. A barcode sequence generally is not part of a template sequence. Barcode sequences may vary widely in size and composition; the following references provide guidance for selecting sets of barcode sequences appropriate for particular embodiments: Brenner, U.S. Pat. No. 5,635,400; Brenner et al, Proc. Natl. Acad. Sci., 97: 1665-1670 (2000); Shoemaker et al, Nature Genetics, 14: 450-456 (1996); Morris et al, European patent publication 0799897A1; Wallace, U.S. Pat. No. 5,981,179. A barcode sequence may have a length of about 4 to 36 nucleotides, about 6 to 30 nucleotides, or about 8 to 20 nucleotides.

Primers used in the present disclosure are generally sufficiently long to prime the synthesis of extension products in the presence of the agent for polymerization. The exact length and composition of a primer can depend on many factors, including temperature of the annealing reaction, source and composition of the primer, and ratio of primer:probe concentration. The primer length can be, for example, about 5-100, 10-50, or 20-30 nucleotides, although a primer may contain more or fewer nucleotides.

Reporter Probes

In some embodiments, the reaction mixture further comprises a reporter probe. Generally, the reporter probe of the present disclosure is designed to produce a detectable signal indicating the presence of the interrogated allele.

The reporter probe can comprise a detectable moiety and a quencher moiety. The detectable moiety can be a dye. The dye can be a fluorescent dye, e.g., a fluorophore. The fluorescent dye can be a derivatized dye for attachment to the terminal 3′ carbon or terminal 5′ carbon of the probe via a linking moiety. The dye can be derivatized for attachment to the terminal 5′ carbon of the probe via a linking moiety. Quenching can involve a transfer of energy between the fluorophore and the quencher. The emission spectrum of the fluorophore and the absorption spectrum of the quencher can overlap. When the probe is intact, the fluorescent signal from the detectable moiety can be substantially suppressed by the quencher. Cleavage of the reporter probe, e.g., by hydrolysis, can separate the detectable moiety from the quencher moiety. In some embodiments, hybridization to a target sequence is sufficient to effect sufficient separation of the fluorophore from the quencher. Separation of the fluorophore from the quencher can be determined by the number of helical turns that exist between the two moieties upon probe binding. The separation can enable the fluorescent moiety to produce a detectable fluorescent signal.

The reporter probes may be designed according to Livak et al., “Oligonucleotides with fluorescent dyes at opposite ends provide a quenched probe system useful for detecting PCR product and nucleic acid hybridization,” PCR Methods Appl. 1995 4: 357-362.

Reporter-quencher moiety pairs for particular probes can be selected according to, e.g., Pesce et al., editors, Fluorescence Spectroscopy (Marcel Dekker, New York, 1971); White et al., Fluorescence Analysis: A Practical Approach (Marcel Dekker, New York, 1970). Exemplary fluorescent and chromogenic molecules that may be used in reporter-quencher pairs are described in, e.g., Berlman, Handbook of Fluorescence Spectra of Aromatic Molecules, 2nd Edition (Academic Press, New York, 1971); Griffiths, Colour and Constitution of Organic Molecules (Academic Press, New York, 1976); Bishop, editor, Indicators (Pergamon Press, Oxford, 1972); Haugland, Handbook of Fluorescent Probes and Research Chemicals (Molecular Probes, Eugene, 1992); Pringsheim, Fluorescence and Phosphorescence (Interscience Publishers, New York, 1949).

A wide variety of reactive fluorescent reporter dyes can be used so long as they are quenched by a quencher dye of the disclosure. The fluorophore can be an aromatic or heteroaromatic compound. The fluorophore can be, for example, a pyrene, anthracene, naphthalene, acridine, stilbene, benzoxazole, indole, benzindole, oxazole, thiazole, benzothiazole, cyanine, carbocyanine, salicylate, anthranilate, xanthene dye, or coumarin. Exemplary xanthene dyes include, e.g., fluorescein and rhodamine dyes. Exemplary fluorescein and rhodamine dyes include, but are not limited to, 6-carboxyfluorescein (FAM), 2′7′-dimethoxy-4′5′-dichloro-6-carboxyfluorescein (JOE), tetrachlorofluorescein (TET), 6-carboxyrhodamine (R6G), N,N,N′,N′-tetramethyl-6-carboxyrhodamine (TAMRA), and 6-carboxy-X-rhodamine (ROX). Suitable fluorescent reporters also include the naphthylamine dyes that have an amino group in the alpha or beta position. For example, naphthylamino compounds include 1-dimethylaminonaphthyl-5-sulfonate, 1-anilino-8-naphthalene sulfonate, 2-p-toluidinyl-6-naphthalene sulfonate, and 5-(2′-aminoethyl)aminonaphthalene-1-sulfonic acid (EDANS). Exemplary coumarins include, e.g., 3-phenyl-7-isocyanatocoumarin; acridines, such as 9-isothiocyanatoacridine and acridine orange; N-(p-(2-benzoxazolyl)phenyl) maleimide; cyanines, such as, e.g., indodicarbocyanine 3 (Cy3), indodicarbocyanine 5 (Cy5), indodicarbocyanine 5.5 (Cy5.5), 3-(-carboxy-pentyl)-3′-ethyl-5,5′-dimethyloxacarbocyanine (CyA); 1H, 5H, 11H, 15H-Xantheno[2,3,4-ij:5,6,7-i′j′]diquinolizin-18-ium, 9-[2 (or 4)-[[[6-[2,5-dioxo-1-pyrrolidinyl)oxy]-6-oxohexyl]amino]sulfonyl]-4 (or 2)-sulfophenyl]-2,3,6,7,12,13,16,17-octahydro-inner salt (TR or Texas Red); or BODIPY™ dyes. Exemplary fluorescent and quencher moieties are described in, e.g., WO/2005/049849, hereby incorporated by reference.

As is known in the art, suitable quenchers are selected according to the fluorescer. Exemplary reporters and quenchers are further described in Anderson et al, U.S. Pat. No. 7,601,821, hereby incorporated by reference.

Quenchers are also available from various commercial sources. Exemplary commercially available quenchers include, e.g., Black Hole Quenchers® from Biosearch Technologies and Iowa Black® or ZEN quenchers from Integrated DNA Technologies, Inc.

In some embodiments, the reporter probe comprises two quencher moieties. Exemplary probes comprising two quencher moieties include the ZEN probes from Integrated DNA Technologies. Such probes comprise an internal quencher moiety that is located about 9 bases away from the detectable moiety, and generally reduce background signal associated with traditional reporter/quencher probes.

Detectable moieties and quencher moieties can be derivatized for covalent attachment to oligonucleotides via common reactive groups or linking moieties. Methods for derivatization of detectable and quencher moieties are described in, e.g., Ullman et al, U.S. Pat. No. 3,996,345; Khanna et al, U.S. Pat. No. 4,351,760; Eckstein, editor, Oligonucleotides and Analogues: A Practical Approach (IRL Press, Oxford, 1991); Zuckerman et al, Nucleic Acids Research, 15: 5305-5321 (1987) (3′ thiol group on oligonucleotide); Sharma et al, Nucleic Acids Research, 19:3019 (1991) (3′ sulfhydryl); Giusti et al, PCR Methods and Applications, 2:223-227 (1993) and Fung et al, U.S. Pat. No. 4,757,141 (5′ phosphoamino group via Aminolink™ II available from Applied Biosystems, Foster City, Calif.); Stabinsky, U.S. Pat. No. 4,739,044 (3′ aminoalkylphosphoryl group); Agrawal et al, Tetrahedron Letters, 31:1543-1546 (1990) (attachment via phosphoramidate linkages); Sproat et al, Nucleic Acids Research, 15:4837 (1987) (5′ mercapto group); Nelson et al, Nucleic Acids Research, 17:7187-7194 (1989) (3′ amino group), all of which are hereby incorporated by reference.

In some embodiments, commercially available linking moieties can be attached to an oligonucleotide during synthesis, e.g., linking moieties available through Clontech Laboratories (Palo Alto, Calif.). By way of example only, rhodamine and fluorescein dyes can be derivatized with a phosphoramidite moiety for attachment to a 5′ hydroxyl of an oligonucleotide (see, e.g., Woo et al, U.S. Pat. No. 5,231,191; and Hobbs, Jr. U.S. Pat. No. 4,997,928, all of which are hereby incorporated by reference).

As temperature decreases, there can be an increase in the fractional binding of the probe with its complementary sequence, which can result in an increase in the fluorescence signal. Differences in the amplitude of the fluorescence signal at a fractional binding of 1 can reflect the differences in the relative orientation of the fluorophore and quencher upon hybridization. FIG. 48, upper panel, shows the orientation of F and Q in trans/staggered conformation, and lower panel shows the orientation of F and Q in cis/eclipsed conformation. Thus, quenching can be related both to the distance between the quencher and reporter and to their relative spatial position.

In some embodiments, the detectable moiety produces a non-fluorescent signal. For example, any probe for which hydrolysis of the probe results in a detectable separation of a signal moiety from the detection probe-amplicon complex may be used. For example, release of the signal moiety may be detected electronically (e.g., as an electrode surface charge perturbation when a signal moiety is released from the detection probe/amplicon complex), by quantum dot sensing, by luminescence, or chemically (e.g., by a change in pH in a solution as a signal moiety is released into solution). Likewise, any probe that binds to a probe-binding region and for which a change in signal can be detected upon separation of a detectable moiety from a quencher moiety may be used. For example, molecular beacon probes, MGB probes, or other probes are contemplated for use in the disclosure. Molecular beacon probes are described in, e.g., U.S. Pat. Nos. 5,925,517 and 6,103,406, hereby incorporated by reference. MGB probes are described in, e.g., U.S. Pat. No. 7,381,818, hereby incorporated by reference.

The reporter probe can be designed to selectively hybridize to a probe-binding region of a primer as described herein. Accordingly, in some embodiments the reporter probe comprises a sequence that is complementary to at least a portion of the probe-binding region. The reporter probe can be 5-50, 6-40, or 7-30 nucleotides in length. The hybridization can result in a probe/primer duplex with a Tm. The Tm of the probe/primer duplex can be higher than the Tm of the primer/template duplex. The Tm of the probe/primer duplex can be 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more than 10° C. higher than the Tm of the primer/template duplex.

Alternatively, the Tm of the probe/primer duplex can be lower than the Tm of the primer/template duplex.

In some embodiments, the reporter probe selectively hybridizes to a sequence in the probe-binding region that is at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, or 20 nucleotides apart from the template binding region of the primer.

The reporter probe can be present at a concentration that is higher than the concentration of the forward primer. The reporter probe can, for example, be present in a concentration that is 1-10 fold or 1-5 fold higher than the concentration of the forward primer. The reporter probe can be present in a concentration that results in at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or about 100% of the forward primers being occupied by the probe.

The primers and probes of the disclosure may be prepared by any suitable method. Methods for preparing oligonucleotides of specific sequence are known in the art, and include, for example, cloning and restriction of appropriate sequences and direct chemical synthesis. Chemical synthesis methods may include, for example, the phosphotriester method described by Narang et al., 1979, Methods in Enzymology 68:90, the phosphodiester method disclosed by Brown et al., 1979, Methods in Enzymology 68:109, the diethylphosphoramidate method disclosed in Beaucage et al., 1981, Tetrahedron Letters 22:1859, and the solid support method disclosed in U.S. Pat. No. 4,458,066, all of which publications are hereby incorporated by reference.

In some embodiments, a forward primer comprising a template binding region and a probe-binding region can be prepared using two different oligonucleotides corresponding to the template binding region and probe binding region, respectively. The two oligonucleotides can be ligated enzymatically. Ligation can be by an RNA ligase. The RNA ligase can be an ATP dependent ligase. The RNA ligase can be an Rnl 1 family ligase. Generally, Rnl 1 family ligases can repair single-stranded breaks in tRNA. Exemplary Rnl 1 family ligases include, e.g., T4 RNA ligase, thermostable RNA ligase 1 from Thermus scotoductus bacteriophage TS2126 (CircLigase), or CircLigase II. Generally, Rnl 2 family ligases can seal nicks in duplex RNAs. Exemplary Rnl 2 family ligases include, e.g., T4 RNA ligase 2. The RNA ligase can be an archaeal RNA ligase, e.g., an archaeal RNA ligase from the thermophilic archaeon Methanobacterium thermoautotrophicum (MthRnl). Ligation can also be effected by use of a splint oligonucleotide that spans the two oligonucleotides corresponding to the template binding and probe binding regions, respectively. In some embodiments, ligation using a splint oligonucleotide can comprise use of a T4 DNA ligase. Alternatively, ligation can be mediated by an ATP-independent ligase. Exemplary ATP-independent ligases include, e.g., RNA 3′-Phosphate Cyclase (RtcA), RNA ligase RtcB, or manufactured variants thereof. In some embodiments, ligation is performed indirectly through a two-step process, in which a template binding region is adenylated (e.g., adenylated chemically during synthesis or enzymatically using a ligase), and the adenylated template binding sequence is conjugated to the probe binding region.

Ligation can also be performed with “click chemistry.” Click chemistry is a concept that involves linking smaller subunits with simple chemistry. Smaller subunits can refer to small building blocks of larger molecules such as DNA bases, RNA nucleotides, or linear or circularized DNA or RNA oligonucleotides. (3+2) cycloadditions between azide and alkyne groups, which result in the formation of 1,2,3-triazole rings (e.g., the copper-catalysed alkyne-azide coupling reaction), are generally considered typical click chemistry reactions. Other chemical ligation methods include the use of cyanogen bromide, phosphorothioate-iodoacetyl, and native ligation techniques where a C-terminal α-thioester is reacted in a chemoselective manner with an unprotected peptide containing an N-terminal Cys residue.

Ligation can be performed in a reaction mixture having a pH in the range of about pH 1 to about pH 14. In some embodiments, the reaction mixture in which ligation occurs has a pH of at least pH 7.1, pH 7.2, pH 7.3, pH 7.4, pH 7.5, pH 7.6, pH 7.7, pH 7.8, pH 7.9, pH 8, pH 8.1, pH 8.2, pH 8.3, pH 8.4, pH 8.5, pH 8.6, pH 8.7, pH 8.8, pH 8.9, pH 9, pH 9.5, pH 10, pH 10.5, pH 11, pH 11.5, pH 12, pH 12.5, pH 13, or greater. In some embodiments, the reaction mixture in which ligation occurs has a neutral pH (pH 7.0). In some embodiments, the reaction mixture in which ligation occurs has a pH of about pH 7.1 to about pH 9, about pH 7.5 to about pH 9, about pH 8 to about pH 10, or about pH 7 to about pH 8. The pH of a reaction mixture in which ligation occurs can be less than pH 14, 13, 12, 11, 10, 9, 8, 7, 6.5, 6, 5.5, 5, 4.5, 4, 3.5, 3, 2.5, 2, 1.5, or 1. The pH of a reaction mixture in which ligation occurs can be about pH 5 to about pH 6, about pH 4 to about pH 5, about pH 3 to about pH 4, about pH 2 to about pH 3, or about pH 1 to about pH 2.

Primers and/or reporter probes can also be obtained from commercial sources such as Operon Technologies, Amersham Pharmacia Biotech, Sigma, IDT Technologies, and Life Technologies. The primers can have an identical melting temperature. The lengths of the primers can be extended or shortened at the 5′ end or the 3′ end to produce primers with desired melting temperatures. Also, the annealing position of each primer pair can be designed such that the sequence and length of the primer pairs yield the desired melting temperature. The simplest equation for determining the melting temperature of primers smaller than 25 base pairs is the Wallace Rule (Td=2(A+T)+4(G+C)), in which A, T, G, and C are the numbers of the respective bases in the primer. Computer programs can also be used to design primers, including but not limited to Array Designer Software (Arrayit Inc.), Oligonucleotide Probe Sequence Design Software for Genetic Analysis (Olympus Optical Co.), NetPrimer, and DNAsis from Hitachi Software Engineering. The Tm (melting or annealing temperature) of each primer can be calculated using software programs such as Oligo Design, available from Invitrogen Corp.
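
By way of illustration only, the Wallace Rule stated above can be computed as in the following Python sketch. The function names, the example sequence, and the trimming strategy (removing bases from the 5′ end until the estimate no longer exceeds a chosen ceiling) are assumptions made for this example; the estimate is in degrees Celsius and, as noted above, is intended for primers shorter than 25 bases.

```python
def wallace_td(primer):
    """Wallace Rule estimate: Td = 2(A+T) + 4(G+C), in degrees Celsius."""
    seq = primer.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    if at + gc != len(seq):
        raise ValueError("primer must contain only A, C, G, and T")
    return 2 * at + 4 * gc


def trim_5prime_to_td(primer, max_td):
    """Shorten a primer from its 5' end until its estimate is at most max_td.

    Trimming at the 5' end leaves the 3' terminal base, which overlays the
    interrogated allele in some embodiments, unchanged.
    """
    seq = primer.upper()
    while len(seq) > 1 and wallace_td(seq) > max_td:
        seq = seq[1:]
    return seq


example = "ACGTTGCAGGTCAGTG"           # made-up 16-nucleotide primer
print(wallace_td(example))              # 50
print(trim_5prime_to_td(example, 44))   # GTTGCAGGTCAGTG
```

The example primer contains 7 A or T bases and 9 G or C bases, giving 2(7) + 4(9) = 50; removing the two 5′ bases (A, then C) lowers the estimate to 44.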

The annealing temperature of the primers can be recalculated and increased after any cycle of amplification, including but not limited to cycle 1, 2, 3, 4, 5, cycles 6-10, cycles 10-15, cycles 15-20, cycles 20-25, cycles 25-30, cycles 30-35, or cycles 35-40. After the initial cycles of amplification, part of the primers may be incorporated into the products from each locus of interest; thus the Tm can be recalculated based on the part of the primer incorporated into the product.

Reaction Mixture for Allele Detection

The term “reaction mixture for allele detection” as used herein generally refers to a mixture of components necessary to amplify at least one amplicon from nucleic acid template molecules. The mixture for allele detection may comprise nucleotides (dNTPs), a polymerase, and primers. The mixture for allele detection may further comprise a Tris buffer, a monovalent salt, and Mg2+. The concentration of each component is well known in the art and can be further optimized by an ordinarily skilled artisan. In some embodiments, the reaction mixture for allele detection also comprises additives including, but not limited to, non-specific background/blocking nucleic acids (e.g., salmon sperm DNA), biopreservatives (e.g., sodium azide), PCR enhancers (e.g., Betaine, Trehalose, etc.), and inhibitors (e.g., RNAse inhibitors). In some embodiments, a nucleic acid sample is admixed with the reaction mixture for allele detection. Accordingly, in some embodiments the reaction mixture for allele detection further comprises a nucleic acid sample.

Amplification

The method can comprise amplification of template nucleic acid in the reaction mixture for allele detection. In some embodiments amplification is carried out utilizing a nucleic acid polymerase. The nucleic acid polymerase can be a DNA polymerase. The DNA polymerase can be a thermostable DNA polymerase.

Some aspects of the allele detection methods described herein relate to the ability of a DNA polymerase to separate a detectable moiety and quencher moiety in a reporter probe. Exemplary reporter probes are described herein. Separation of the detectable and quencher moiety can occur by cleavage of the reporter probe by the DNA polymerase. Cleavage of the reporter probe can occur by an exonuclease activity of the DNA polymerase. Accordingly, in some embodiments, the DNA polymerase comprises 5′→3′ exonuclease activity. As used herein, “5′→3′ nuclease activity” or “5′ to 3′ nuclease activity” can refer to an activity of a template-specific nucleic acid polymerase whereby nucleotides are removed from the 5′ end of an oligonucleotide in a sequential manner. DNA polymerases with 5′→3′ exonuclease activity are known in the art and include, e.g., DNA polymerase isolated from Thermus aquaticus (Taq DNA polymerase).

Some aspects of the allele detection methods described herein further relate to the discriminative ability of a primer to be extended by a nucleic acid polymerase (e.g., a DNA polymerase) in an amplification step, depending on the presence or absence of a mismatch between the terminal 3′ base of the primer and its hybridized template polynucleotide. In cases wherein there is no mismatch between the terminal 3′ base of the primer and template nucleotide, extension of the primer by DNA polymerase can efficiently occur during an amplification reaction. In cases wherein there is a mismatch between the terminal 3′ base of the primer and template nucleotide (e.g., the bases are not complementary), extension of the primer by DNA polymerase does not occur. In some embodiments extension of the mismatched primer does not occur if the DNA polymerase lacks 3′→5′ exonuclease activity. 3′→5′ exonuclease activity, as used herein, generally refers to an activity of a DNA polymerase whereby the polymerase recognizes a mismatched base pair and moves backward by one base to excise the incorrect nucleotide. Accordingly, the DNA polymerase can lack 3′→5′ exonuclease activity. Exemplary DNA polymerases lacking 3′→5′ exonuclease activity include, but are not limited to, BST DNA polymerase I, BST DNA polymerase I (large fragment), Taq polymerase, Streptococcus pneumoniae DNA polymerase I, Klenow Fragment (3′→5′ exo-), PyroPhage® 3173 DNA Polymerase, Exonuclease Minus (Exo-) (available from Lucigen), and T4 DNA Polymerase, Exonuclease Minus (Lucigen). In some embodiments, the DNA polymerase is a recombinant DNA polymerase that has been engineered to lack exonuclease activity.

In other embodiments, extension of the mismatched primer by DNA polymerase does not occur wherein the DNA polymerase has 3′→5′ exonuclease activity. In particular embodiments, extension of the mismatched primer by DNA polymerase having 3′→5′ exonuclease activity does not occur if the 3′ terminal region of the mismatched primer comprises nucleotides linked by phosphorothioate linkages. Exemplary primers comprising nucleotides linked by phosphorothioate linkages are described herein.

In some embodiments, the PCR process is carried out as an automated process wherein the reaction mixture comprising template DNA is cycled through a denaturing step, a reporter probe and primer annealing step, and a synthesis step, whereby cleavage and displacement occur simultaneously with primer-dependent template extension. The automated process may be carried out using a PCR thermal cycler. Commercially available thermal cycler systems include systems from Bio-Rad Laboratories, Life Technologies, and Perkin-Elmer, among others.

Repeated cycles of denaturation, primer/probe annealing, primer extension, and reporter probe cleavage can result in the exponential accumulation of detectable signal. Sufficient cycles are run to achieve detection of the detectable signal, which can be several orders of magnitude greater than background signal.

The present disclosure is compatible, however, with other amplification systems, such as the transcription amplification system, in which one of the PCR primers encodes a promoter that is used to make RNA copies of the target sequence. In similar fashion, the present disclosure can be used in a self-sustained sequence replication (3SR) system, in which a variety of enzymes are used to make RNA transcripts that are then used to make DNA copies, all at a single temperature. By incorporating a polymerase with 5′→3′ exonuclease activity into a ligase chain reaction (LCR) system, together with appropriate primer/probe sets, one can also employ the present disclosure to detect LCR products.

FIG. 6 depicts an exemplary embodiment of a method of the present disclosure. In a first step 601, a DNA sample comprising template DNA molecules 602 and 603 is contacted with a reaction mixture comprising dNTPs (not shown), a thermostable DNA polymerase 609 comprising 5′→3′ exonuclease activity and not comprising 3′→5′ exonuclease activity, a forward primer F1 comprising a probe-binding region 605 and a template binding region 606, and a reverse primer R. The 3′ terminal base of the forward primer F1 is complementary to a mutant allele 607 which resides on template molecule 602. By contrast, template molecule 603 has a wild-type allele 608 which is mismatched to the 3′ terminal base of forward primer F1. Also comprised in the reaction mixture is a reporter probe P which comprises a 5′ fluorescent moiety (triangle) and a 3′ quencher moiety (circle). In a first round of amplification (step 620), an annealing step is carried out wherein reporter probe P hybridizes to probe-binding region 605, resulting in a primer/reporter duplex P/F1. Additionally, F1 hybridizes to template molecules 602 and 603, resulting in complexes P/F1/602 and P/F1/603. During a synthesis step, DNA polymerase 609 promotes efficient extension of the P/F1/602 complex due to complementarity of the 3′ terminal base of F1 with mutant allele 607. The extension of F1 from template molecule 602 results in a chimeric extension product comprising the extended primer F1 and the hybridized reporter probe P. The extended primer F1 further comprises a primer binding site for reverse primer R. By contrast, extension of P/F1/603 does not occur because of a mismatch between wild-type allele 608 and the 3′ terminal base of F1. Accordingly, no chimeric extension product comprising the extended primer F1 and hybridized reporter probe P is produced from a template molecule containing the wild-type allele. In a second (and any subsequent) round of amplification (step 630), reverse primer R hybridizes to the chimeric extension product. DNA polymerase 609 promotes extension of reverse primer R, and the 5′→3′ exonuclease activity of polymerase 609 separates the fluorescent moiety from the quencher moiety, e.g., by hydrolysis, resulting in a detectable signal. In some embodiments, the probe P is not a low Tm probe. In some embodiments, the probe P is a low Tm probe comprising: a detectable moiety; a quencher moiety; and a melting temperature (Tm) below 50° C. In some embodiments, the low Tm probe has a length of 8-30 nucleotides. In some embodiments, the detectable moiety is quenched at a temperature of 55° C. or higher. In some embodiments, the low Tm probe does not hybridize to a complementary template nucleic acid at an ambient temperature above 55° C. In some embodiments, the quencher moiety quenches the detectable moiety if the probe is not hybridized to a template strand. In some embodiments, the Tm of the low Tm probe is between 30-45° C. In some embodiments, the fluorophore moiety and quencher moiety of the low Tm probe are spaced at least seven nucleotides apart. In some embodiments, the low Tm probe comprises a nucleotide with a Tm enhancing base. In some embodiments the nucleotide with a Tm enhancing base is a Superbase, locked nucleotide, or bridge nucleotide.

In some embodiments, a reaction mixture can comprise multiple primers and probes for multiplex detection. By way of example only, a reaction mixture can comprise a common reverse primer and two or more forward primers, wherein each of the forward primers hybridizes to the same region in the template polynucleotide but differs from the other forward primers in the 5′ probe-binding region, wherein each forward primer comprises a unique probe-binding region, and wherein the template binding region of each of the forward primers differs from the other forward primers in the 3′ terminal base, which is complementary to either a wild-type allele or to one or another mutant alleles. Accordingly, the reaction mixture can also comprise two or more different reporter probes, each probe having a sequence corresponding to one of the two or more unique probe-binding regions on the two or more forward primers and comprising a detectable moiety that is detectably distinct from any other detectable moiety in the reaction mixture. In some embodiments, the probes P1 and P2 are not low Tm probes. In some embodiments, the probes P1 and P2 are low Tm probes, each comprising a detectable moiety; a quencher moiety; and a melting temperature (Tm) below 50° C. In some embodiments, the low Tm probe has a length of 8-30 nucleotides. In some embodiments, the detectable moiety is quenched at a temperature of 55° C. or higher. In some embodiments, the low Tm probe does not hybridize to a complementary template nucleic acid at an ambient temperature above 55° C. In some embodiments, the quencher moiety quenches the detectable moiety if the probe is not hybridized to a template strand. In some embodiments, the Tm of the low Tm probe is between 30-45° C. In some embodiments, the fluorophore moiety and quencher moiety of the low Tm probe are spaced at least seven nucleotides apart. In some embodiments, the low Tm probe comprises a nucleotide with a Tm enhancing base. In some embodiments, the nucleotide with a Tm enhancing base is a Superbase, locked nucleotide, or bridge nucleotide. An exemplary embodiment of a multiplex assay detecting multiple alleles at a single locus is depicted in FIG. 7. In a first step 740, a DNA sample comprising template DNA molecules 702 and 703 is contacted with a reaction mixture comprising dNTPs (not shown), a thermostable DNA polymerase 709 comprising 5′→3′ exonuclease activity and not 3′→5′ exonuclease activity, a forward primer F1 comprising a probe-binding region 705 and a template binding region 706, and a forward primer F2 comprising a probe-binding region 710 and a template binding region 711. The template binding regions 706 and 711 are identical except for the 3′ terminal base, which in F1 is complementary to a mutant allele 707 which resides on template molecule 702 and in F2 is complementary to a wild-type allele 708 which resides on template molecule 703. Accordingly, there is a mismatch between the 3′ terminal base of 706 and wild-type allele 708, and a mismatch between the 3′ terminal base of 711 and mutant allele 707. Also comprised in the reaction mixture is reporter probe P1 which comprises a 5′ fluorescent moiety (triangle) and a 3′ quencher moiety (circle) and reporter probe P2 which comprises a spectrally distinct 5′ fluorescent moiety (square) and a 3′ quencher moiety (circle). The reporter probe P1 hybridizes to probe-binding region 705, resulting in a P1/F1 duplex, and reporter probe P2 hybridizes to probe-binding region 710, resulting in a P2/F2 duplex. In a first round of amplification (step 750), F1 and F2 hybridize to template molecules 702 and 703, which can result in P1/F1/702, P1/F1/703, P2/F2/702, and P2/F2/703 complexes. DNA polymerase 709 can promote efficient extension of P1/F1/702 and P2/F2/703, which can result in chimeric extension products comprising the extended primer F1 and the hybridized reporter probe P1 (F1-P1) and/or extended primer F2 and the hybridized reporter probe P2 (F2-P2), respectively. The extended primers F1-P1 and F2-P2 may each further comprise a primer binding site for reverse primer R. By contrast, no extension of P1/F1/703 or P2/F2/702 occurs due to the presence of a mismatch between the 3′ terminal base of the forward primers and the template DNA. Accordingly, no chimeric extension product comprising the extended primer F1 and hybridized reporter probe P2 or comprising extended primer F2 and hybridized reporter P1 is produced. In a second (and any subsequent) round of amplification (step 760), reverse primer R can hybridize to the chimeric extension products F1-P1 and F2-P2. DNA polymerase 709 can promote extension of reverse primer R, and the 5′→3′ exonuclease activity of polymerase 709 separates the fluorescent moiety from the quencher moiety of each probe P1 and P2, resulting in spectrally distinct signals 731 and 732.
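The allele discrimination described for FIG. 7 can be illustrated with a minimal sketch. It is an idealized model, not the disclosed assay: it assumes that a mismatch at the 3′ terminal base fully blocks extension, and the names `ForwardPrimer` and `signals_for_template`, as well as the example bases, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

# Watson-Crick pairing used to decide whether a primer's 3' base matches the template.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


@dataclass(frozen=True)
class ForwardPrimer:
    name: str           # e.g. "F1" (label only)
    terminal_base: str  # 3' terminal base of the template binding region
    probe: str          # reporter probe that binds this primer's 5' probe-binding region


def signals_for_template(template_base: str, primers: list[ForwardPrimer]) -> list[str]:
    """Return the probes expected to yield a signal for one template molecule.

    template_base is the base at the interrogated position on the strand the
    forward primers anneal to. Idealization (an assumption): a primer is
    extended only if its 3' terminal base is complementary to that base.
    """
    return [p.probe for p in primers if COMPLEMENT[p.terminal_base] == template_base]


# Hypothetical locus: the mutant template carries T, the wild-type template carries C.
primers = [
    ForwardPrimer("F1", terminal_base="A", probe="P1"),  # matches the mutant allele
    ForwardPrimer("F2", terminal_base="G", probe="P2"),  # matches the wild-type allele
]
mutant_signals = signals_for_template("T", primers)
wild_type_signals = signals_for_template("C", primers)
```

In this model each template produces a signal only from the probe paired with its matching primer, which mirrors the spectrally distinct signals described above.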

By way of other example only, a reaction mixture can comprise a plurality of primer/probe sets, wherein each set comprises a plurality of forward primers for the detection of multiple alleles at a particular locus, each forward primer harboring a unique probe-binding sequence and a template binding region, the 3′ terminal base of the template binding region corresponding to an allele of the locus, a common reverse primer, and detectably distinct reporter probes specific for each forward primer in the set. Such a reaction mixture can be used for the multiplex detection of multiple alleles at a plurality of loci. Accordingly, in some embodiments the disclosure provides a method of detecting up to 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, or 100 alleles in a single multiplex assay.

In some embodiments, a reaction mixture comprises a plurality of primer/probe sets, wherein each set comprises a forward primer harboring a unique probe-binding sequence and a template binding region, a reverse primer that binds to a region downstream of said forward primer, and a detectably distinct reporter probe specific for the forward primer. Such a reaction mixture can be used for the multiplex detection of multiple loci. Multiplex detection of multiple loci can be used to assay copy number variation. For example, a first locus can be a region suspected of having a copy number variation and a second locus can be a region that is predicted to not have a copy number variation. Comparison of detectable signal corresponding to the first and second loci can be used to measure copy number variation.

The detectable signal can be monitored in real-time during each amplification cycle. As used herein, “real-time PCR” can refer to PCR methods wherein an amount of detectable signal is monitored with each cycle of PCR. In some embodiments, a cycle threshold (Ct) wherein a detectable signal reaches a detectable level is determined. Generally, the lower the Ct value, the greater the concentration of the interrogated allele. Generally, data is collected during the exponential growth (log) phase of PCR, wherein the quantity of the PCR product is directly proportional to the amount of template nucleic acid. Systems for real-time PCR are known in the art and include, e.g., the ABI 7700 and 7900HT Sequence Detection Systems (Applied Biosystems, Foster City, Calif.). The increase in signal during the exponential phase of PCR can provide a quantitative measurement of the amount of templates containing the mutant allele.
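As an illustration of how a Ct value might be read from per-cycle signal, the following minimal sketch finds the fractional cycle at which a fluorescence curve first reaches a chosen threshold, using linear interpolation between cycles. The function name `cycle_threshold`, the interpolation scheme, and the example curves are assumptions for illustration; commercial instruments use their own baseline and threshold algorithms.

```python
from typing import Optional, Sequence


def cycle_threshold(fluorescence: Sequence[float], threshold: float) -> Optional[float]:
    """Return the fractional cycle at which the signal first reaches threshold.

    fluorescence[0] is the reading after cycle 1, fluorescence[1] after cycle 2,
    and so on. Returns None if the threshold is never reached.
    """
    if not fluorescence:
        return None
    if fluorescence[0] >= threshold:
        return 1.0
    for i in range(1, len(fluorescence)):
        prev, curr = fluorescence[i - 1], fluorescence[i]
        if prev < threshold <= curr:
            # prev was read at cycle i and curr at cycle i + 1 (1-indexed cycles).
            return i + (threshold - prev) / (curr - prev)
    return None


# Hypothetical curves: the second sample starts with twice as much template,
# so with perfect doubling it crosses the threshold one cycle earlier.
ct_low_input = cycle_threshold([1, 2, 4, 8, 16], threshold=6)
ct_high_input = cycle_threshold([2, 4, 8, 16, 32], threshold=6)
```

The lower Ct of the second curve reflects the inverse relationship between Ct and starting concentration noted above.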

In other embodiments, the detectable signal is monitored after amplification cycles have terminated (e.g., endpoint detection).

Partitioning/Digital PCR

The method also can comprise partitioning the reaction mixture and nucleic acid sample into discrete volumes prior to amplification. Discrete volumes can contain template nucleic acid molecules from a starting nucleic acid sample. The starting nucleic acid sample can be diluted such that discrete volumes contain on average less than five, four, three, two, or one nucleic acid molecule. Partitions can contain no nucleic acid molecule. Partitions with no nucleic acids enable the use of Poisson statistics to determine original input DNA concentration. In some embodiments, discrete volumes can comprise a reaction mixture. Reaction mixtures are described herein. The method can comprise partitioning a nucleic acid sample into one set of discrete volumes, partitioning a reaction mixture into a second set of discrete volumes, and merging single discrete volumes from the first set with single discrete volumes from the second set to produce merged discrete volumes comprising a template nucleic acid molecule and a reaction mixture. In other embodiments, the method comprises admixing a nucleic acid sample with a reaction mixture to produce an admixture, and partitioning the admixture into discrete volumes. Discrete volumes can be independently assayed for the detection of one or more alleles.
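The Poisson estimate mentioned above can be sketched as follows. If template molecules are distributed randomly among partitions, the fraction of empty partitions is expected to be exp(-λ), where λ is the mean number of molecules per partition, so λ = -ln(negative/total). The function names and the example partition volume are assumptions made for illustration.

```python
import math


def copies_per_partition(n_negative: int, n_total: int) -> float:
    """Estimate the mean number of template molecules per partition (lambda).

    Assumes random (Poisson) loading, so P(empty partition) = exp(-lambda).
    At least one partition must be negative, otherwise lambda is unbounded.
    """
    if n_total <= 0 or not 0 < n_negative <= n_total:
        raise ValueError("need 0 < n_negative <= n_total")
    return -math.log(n_negative / n_total)


def concentration_copies_per_ul(n_negative: int, n_total: int, partition_volume_nl: float) -> float:
    """Convert lambda into copies per microliter of partitioned mixture."""
    lam = copies_per_partition(n_negative, n_total)
    return lam / (partition_volume_nl * 1e-3)  # 1 nl = 1e-3 ul


# Hypothetical run: 10,000 partitions of 1 nl each, half of them negative.
lam = copies_per_partition(5000, 10000)
conc = concentration_copies_per_ul(5000, 10000, partition_volume_nl=1.0)
```

Here half of the partitions being empty corresponds to about 0.69 molecules per partition on average, not 0.5, because some positive partitions hold more than one molecule.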

Specific methods for partitioning are not critical to the practice of the disclosure. For example, partitioning can be carried out by manual pipetting. In a particular example, reaction mixture and nucleic acid sample can be distributed to individual tubes or wells by manual pipetting. In another example, robotic methods can be used for the partitioning step. Microfluidic methods can also be used for the partitioning step.

A discrete volume can be, e.g., a tube, a well, a perforated hole, a reaction chamber, or a droplet, such as a droplet of an aqueous phase dispersed in an immiscible liquid, such as described in U.S. Pat. No. 7,041,481. Discrete volumes can be arranged into arrays of discrete volumes. Exemplary arrays include the Open array digital PCR system by Life Technologies (described in tools.invitrogen.com/content/sfs/manuals/cms_088717.pdf) and array systems by Fluidigm (www.fluidigm.com).

Partitioning a sample into small reaction volumes can confer many advantages. For example, the partitioning may enable the use of reduced amounts of reagents, thereby lowering the material cost of the analysis. By way of other example, partitioning can also improve sensitivity of detection. Without wishing to be bound by theory, partitioning of the reaction mixture and template DNA into discrete reaction volumes can give rare molecules greater proportional access to reaction reagents, thereby enhancing detection of rare molecules. For example, partitioning can enable the detection of a rare allele in a background of high wild-type allelic ratio. Accordingly, in some embodiments a reaction volume can be less than 1 ml, less than 500 microliters (ul), less than 100 ul, less than 10 ul, less than 1 ul, less than 0.5 ul, less than 0.1 ul, less than 50 nl, less than 10 nl, less than 1 nl, less than 0.1 nl, less than 0.01 nl, less than 0.001 nl, less than 0.0001 nl, less than 0.00001 nl, or less than 0.000001 nl. In some embodiments, a reaction volume can be 1-100 picoliters (pl), 50-500 pl, 0.1-10 nanoliters (nl), 1-100 nl, 50-500 nl, 0.1-10 microliters (ul), 5-100 ul, 100-1000 ul, or more than 1000 ul. In some embodiments, the reaction volumes are droplets. Without wishing to be bound by theory, the use of small droplets can enable the processing of large numbers of reactions in parallel. Accordingly, in some cases, the droplets have an average diameter of about 0.000000000000001, 0.0000000000001, 0.00000000001, 0.000000001, 0.0000001, 0.000001, 0.00001, 0.0001, 0.001, 0.01, 0.05, 0.1, 1, 5, 10, 20, 30, 40, 50, 60, 70, 80, 100, 120, 130, 140, 150, 160, 180, 200, 300, 400, or 500 microns.
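To relate the droplet diameters and reaction volumes listed above, a droplet can be approximated as a sphere of volume V = (π/6)·d³. The minimal sketch below converts a diameter in microns to picoliters; the spherical approximation and the function name `droplet_volume_pl` are assumptions for illustration.

```python
import math


def droplet_volume_pl(diameter_um: float) -> float:
    """Volume in picoliters of a spherical droplet with the given diameter in microns.

    One cubic micron is one femtoliter, and 1000 femtoliters make one picoliter.
    """
    if diameter_um < 0:
        raise ValueError("diameter must be non-negative")
    volume_fl = math.pi / 6.0 * diameter_um ** 3
    return volume_fl / 1000.0


# A 100 micron droplet holds roughly half a nanoliter (about 524 pl).
volume_100um = droplet_volume_pl(100.0)
```

Because volume scales with the cube of diameter, a tenfold smaller droplet holds a thousandfold less liquid.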

In some embodiments, the method comprises detection and/or measurement of an allele by digital PCR. The term “digital PCR”, as used herein, generally refers to a PCR amplification which is carried out on a nominally single, selected template molecule, wherein a number of individual single molecules are each isolated into discrete reaction volumes. In some embodiments, a large number of reaction volumes are used to produce higher statistical significance. Generally, PCR amplification in a reaction volume containing at least a single template (such as, e.g., a well, chamber, bead, emulsion, etc.) can have either a negative result, e.g., no detectable signal if no starting molecule is present, or a positive result, e.g., a detectable signal, if the targeted starting molecule is present. By analyzing a number of reaction areas indicating a positive result, insight into the number of starting molecules can be obtained. Such an analysis can be used for measurement of an amount of wild-type or mutant alleles in a sample, or be used for a measurement of copy number variation of a locus in a sample.
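A copy number measurement from digital PCR counts can be sketched as the ratio of the estimated template loads of two loci. The sketch below assumes random (Poisson) loading of partitions, a reference locus present at a known copy number (two copies per genome by default), and the same number of analyzed partitions for both loci; the function names are hypothetical.

```python
import math


def mean_copies(n_positive: int, n_total: int) -> float:
    """Poisson estimate of mean template copies per reaction volume."""
    if not 0 <= n_positive < n_total:
        raise ValueError("need 0 <= n_positive < n_total")
    return -math.log(1.0 - n_positive / n_total)


def copy_number(target_positive: int, reference_positive: int, n_total: int,
                reference_copies: float = 2.0) -> float:
    """Estimate copies of the target locus per genome relative to a reference locus."""
    reference_load = mean_copies(reference_positive, n_total)
    if reference_load == 0.0:
        raise ValueError("reference locus was not detected")
    return reference_copies * mean_copies(target_positive, n_total) / reference_load


# Hypothetical counts out of 20,000 reaction volumes.
balanced = copy_number(target_positive=5000, reference_positive=5000, n_total=20000)
amplified = copy_number(target_positive=7500, reference_positive=5000, n_total=20000)
```

Equal positive counts give the reference copy number, while an excess of target-positive volumes suggests a gain at the first locus.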

In particular embodiments, the method comprises droplet digital PCR methods. “Droplet digital PCR” generally refers to digital PCR wherein the reaction volumes are droplets. The droplets provided herein can prevent mixing between reaction volumes.

The droplets described herein can include emulsion compositions. The term “emulsion”, as used herein, generally refers to a mixture of immiscible liquids (such as oil and an aqueous solution, e.g., water). In some embodiments, the emulsion comprises aqueous droplets within a continuous oil phase. In other embodiments, the emulsion comprises oil droplets within a continuous aqueous phase. The mixtures or emulsions described herein may be stable or unstable. In preferred embodiments, the emulsions are relatively stable.

In some embodiments the emulsions exhibit minimal coalescence. “Coalescence” refers to a process in which droplets combine to form progressively larger droplets. In some cases, less than 0.00001%, 0.00005%, 0.00010%, 0.00050%, 0.001%, 0.005%, 0.01%, 0.05%, 0.1%, 0.5%, 1%, 2%, 2.5%, 3%, 3.5%, 4%, 4.5%, 5%, 6%, 7%, 8%, 9%, or 10% of droplets exhibit coalescence. The emulsions may also exhibit limited flocculation, a process by which the dispersed phase comes out of suspension in flakes. In some cases, less than 0.00001%, 0.00005%, 0.00010%, 0.00050%, 0.001%, 0.005%, 0.01%, 0.05%, 0.1%, 0.5%, 1%, 2%, 2.5%, 3%, 3.5%, 4%, 4.5%, 5%, 6%, 7%, 8%, 9%, or 10% of droplets exhibit flocculation.

The droplets can either be monodisperse (e.g., of substantially similar size and dimensions) or polydisperse (e.g., of substantially variable size and dimensions). In some embodiments, the droplets are monodisperse droplets. In some cases, the droplets are generated such that the size of the droplets does not vary by more than plus or minus 5% of the average size of the droplets. In some cases, the droplets are generated such that the size of the droplets does not vary by more than plus or minus 2% of the average size of the droplets. In some cases, a droplet generator will generate a population of droplets from a single sample, wherein none of the droplets vary in size by more than plus or minus 0.1%, 0.5%, 1%, 1.5%, 2%, 2.5%, 3%, 3.5%, 4%, 4.5%, 5%, 5.5%, 6%, 6.5%, 7%, 7.5%, 8%, 8.5%, 9%, 9.5%, or 10% of the average size of the total population of droplets.
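The size criterion above can be checked with a short sketch that tests whether every droplet in a population lies within a stated percentage of the population's average size. The function names and the example measurements are assumptions for illustration; "size" may be any single consistent measure, such as diameter.

```python
from statistics import mean
from typing import Sequence


def max_relative_deviation(sizes: Sequence[float]) -> float:
    """Largest deviation of any droplet from the average size, as a fraction of the average."""
    if not sizes:
        raise ValueError("no droplets measured")
    average = mean(sizes)
    return max(abs(s - average) for s in sizes) / average


def is_monodisperse(sizes: Sequence[float], tolerance: float = 0.05) -> bool:
    """True if no droplet varies by more than plus or minus tolerance of the average size."""
    return max_relative_deviation(sizes) <= tolerance


# Hypothetical diameters in microns.
tight_population = [98, 100, 102]   # within plus or minus 2% of the average
loose_population = [90, 100, 110]   # plus or minus 10% of the average
```

With these values, the first population satisfies both the 5% and 2% criteria, while the second satisfies neither.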

In some embodiments, the present disclosure provides systems, devices, and methods for droplet generation. In some embodiments, microfluidic systems are configured to generate monodisperse droplets (see, e.g., Kiss et al. Anal Chem. 2008 Dec. 1; 80(23): 8975-8981). In some embodiments, the present disclosure provides microfluidics systems for manipulating and/or partitioning samples.

In some embodiments, a microfluidics system comprises one or more of channels, valves, pumps, etc. (U.S. Pat. No. 7,842,248, herein incorporated by reference in its entirety). In some embodiments, a microfluidics system is a continuous-flow microfluidics system (see, e.g., Kopp et al., Science, vol. 280, pp. 1046-1048, 1998, hereby incorporated by reference). In some embodiments, microarchitecture of the present disclosure includes, but is not limited to, microchannels, microfluidic plates, fixed microchannels, networks of microchannels, internal pumps, external pumps, valves, centrifugal force elements, etc. In some embodiments, the microarchitecture of the present disclosure (e.g., droplet microactuator, microfluidics platform, and/or continuous-flow microfluidics) is complemented or supplemented with droplet manipulation techniques, including, but not limited to, electrical (e.g., electrostatic actuation, dielectrophoresis), magnetic, thermal (e.g., thermal Marangoni effects, thermocapillary), mechanical (e.g., surface acoustic waves, micropumping, peristaltic), optical (e.g., opto-electrowetting, optical tweezers), and chemical means (e.g., chemical gradients). In some embodiments, a droplet microactuator is supplemented with a microfluidics platform (e.g., continuous flow components) and such combination approaches involving discrete droplet operations and microfluidics elements are within the scope of the disclosure.

In some embodiments, methods of the disclosure utilize a droplet microactuator. In some embodiments, a droplet microactuator is capable of effecting droplet manipulation and/or operations, such as, e.g., dispensing, splitting, transporting, merging, mixing, agitating. In some embodiments the disclosure employs droplet operation structures and techniques described in, e.g., U.S. Pat. Nos. 6,911,132, 6,773,566, and 6,565,727; U.S. patent application Ser. No. 11/343,284, and U.S. Patent Publication No. 20060254933, all of which are hereby incorporated by reference.

Droplet digital PCR techniques enable a high density of discrete PCR amplification reactions in a single volume. In some embodiments, greater than 100,000, 500,000, 1,000,000, 1,500,000, 2,000,000, 2,500,000, 5,000,000, or 10,000,000 separate reactions may occur per ul.

Detection

Fluorescence detection can be achieved using a variety of detector devices equipped with a module to generate excitation light that can be absorbed by a fluorescer, as well as a module to detect light emitted by the fluorescer. In some cases, samples (such as droplets) may be detected in bulk. For example, samples may be allocated in plastic tubes that are placed in a detector that measures bulk fluorescence from plastic tubes. The samples can be distributed in a monolayer. Monolayer distributed samples can be detected by scanning using high resolution scanners (e.g., microarray scanners, GenePix 4000B Microarray Scanner (Molecular Devices), SureScan Microarray Scanner (Agilent)). If the sample is distributed in multiple layers, the sample can be detected with confocal imaging (e.g., confocal microscopy, spinning-disk confocal microscopy, confocal laser scanning microscopy). In some cases, one or more samples (such as droplets) may be partitioned into one or more wells of a plate, such as a 96-well or 384-well plate, and fluorescence of individual wells may be detected using a fluorescence plate reader.

In some embodiments, amplification of the droplets, e.g., in a thermal cycler, results in the generation of one or more detectable signals in a number of droplets. During the amplification reaction, a droplet comprising a template DNA molecule containing an interrogated allele can exhibit an increase in fluorescence relative to droplets that do not contain an interrogated allele. Droplets can be processed individually and fluorescence data collected from the droplets. For example, data relating to fluorescent signals from spectrally distinct fluorophores may be collected from each droplet.

A number of commercial instruments are available for analysis of fluorescently labeled materials. For instance, the ABI Gene Analyzer can be used to analyze attomole quantities of DNA tagged with fluorophores such as ROX (6-carboxy-X-rhodamine), rhodamine-NHS, TAMRA (5/6-carboxytetramethyl rhodamine NHS), and FAM (5′-carboxyfluorescein NHS). These compounds are attached to the probe by an amide bond through a 5′-alkylamine on the probe. Attachment can also occur through phosphoramidite precursors (e.g., 2-methoxy-3-trifluoroacetyl-1,3,2-oxazaphosphacyclopentane or N-(3-(N′,N′-diisopropylaminomethoxyphosphinyloxy)propyl)-2,2,2-trifluoroacetamide), which is a method to conjugate amino-derivatized polymers, especially oligonucleotides. Other useful fluorophores include CNHS (7-amino-4-methyl-coumarin-3-acetic acid, succinimidyl ester), which can also be attached through an amide bond.

Following digital PCR, the number of positive samples having a particular allele and the number of positive samples having any other allele (e.g., a wild-type allele) can be counted. In some cases, quantitative determinations are made by measuring the fluorescence intensity of individual partitions, while in other cases, measurements are made by counting the number of partitions containing detectable signal. In some embodiments, control samples can be included to provide background measurements that can be subtracted from all the measurements to account for background fluorescence. In other embodiments, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more than 10 different colors can be used to detect and measure different alleles, such as by using fluorophores of different colors on different PCR primers matched to probes recognizing different sequences.
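The counting step described above can be sketched for a two-color assay. The sketch subtracts a single background value (for example, one measured from control samples) from each partition's intensity, calls a partition positive in a channel when the corrected signal meets a threshold, and reports the fraction of allele-positive calls that are mutant. The class and function names, the single shared background and threshold, and the example intensities are all assumptions for illustration; the fraction is a simple count ratio with no Poisson correction.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class Partition:
    mutant_signal: float     # raw intensity in the mutant-allele color channel
    wild_type_signal: float  # raw intensity in the wild-type color channel


def count_positive(partitions: Sequence[Partition], background: float,
                   threshold: float) -> Tuple[int, int]:
    """Return (mutant-positive count, wild-type-positive count) after background subtraction."""
    mutant = sum(1 for p in partitions if p.mutant_signal - background >= threshold)
    wild_type = sum(1 for p in partitions if p.wild_type_signal - background >= threshold)
    return mutant, wild_type


def mutant_fraction(mutant: int, wild_type: int) -> float:
    """Fraction of positive calls that are mutant; 0.0 when nothing was detected."""
    total = mutant + wild_type
    return mutant / total if total else 0.0


# Hypothetical intensities in arbitrary units.
partitions = [
    Partition(900, 100),  # mutant-positive
    Partition(100, 950),  # wild-type-positive
    Partition(120, 880),  # wild-type-positive
    Partition(90, 110),   # empty
]
mutant_count, wild_type_count = count_positive(partitions, background=100, threshold=500)
fraction = mutant_fraction(mutant_count, wild_type_count)
```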

In another embodiment of the disclosure, detection of a hydrolyzed reporter probe can be accomplished using, for example, luminescence (e.g., using Yttrium or Beryllium conjugates of EDTA), time-resolved fluorescence spectroscopy, a technique in which fluorescence is monitored as a function of time after excitation, or fluorescence polarization, a technique to differentiate between large and small molecules based on molecular tumbling. Large molecules (e.g., intact labeled probe) tumble in solution much more slowly than small molecules. Upon linkage of a fluorescent moiety to the molecule of interest (e.g., the 5′ end of a labeled probe), this fluorescent moiety can be measured (and differentiated) based on molecular tumbling, thus differentiating between intact and digested probe. Detection may be measured directly during PCR or may be performed post PCR.

Kits for Allele Detection

Also provided in the disclosure are kits for the detection of one or more alleles of a locus. Kits may include one or more oligonucleotide primers as described herein, wherein each of the primers is capable of selectively detecting an individual allele of a locus. Kits may also include one or more reporter probes, as described herein. Kits can include, for example, one or more primer/probe sets. Exemplary primer/probe sets are described herein. Kits may further comprise instructions for use of the one or more primer/probe sets, e.g., instructions for practicing a method of the disclosure. In some embodiments, the kit includes a packaging material. As used herein, the term “packaging material” can refer to a physical structure housing the components of the kit. The packaging material can maintain sterility of the kit components, and can be made of material commonly used for such purposes (e.g., paper, corrugated fiber, glass, plastic, foil, ampules, etc.). Kits can also include a buffering agent, a preservative, or a protein/nucleic acid stabilizing agent. Kits can also include other components of a reaction mixture as described herein. For example, kits may include one or more aliquots of thermostable DNA polymerase as described herein, and/or one or more aliquots of dNTPs. Kits can also include control samples of known amounts of template DNA molecules harboring the individual alleles of a locus. In some embodiments, the kit includes a negative control sample, e.g., a sample that does not contain DNA molecules harboring the individual alleles of a locus. In some embodiments, the kit includes a positive control sample, e.g., a sample containing known amounts of one or more of the individual alleles of a locus.

Systems for Allele Detection

Also provided in the disclosure are systems for the detection of one or more alleles in a sample. The system can provide a reaction mixture as described herein. In some embodiments the reaction mixture is admixed with a DNA sample and comprises template DNA. In some embodiments, the system further provides a droplet generator, which partitions the template DNA molecules, probes, primers, and other reaction mixture components into multiple droplets within a water-in-oil emulsion. Examples of some droplet generators useful in the present disclosure are provided in International Application No. PCT/US2009/005317. The system can further provide a thermocycler, which reacts the droplets via, e.g., PCR, to allow amplification and generation of one or more detectable signals. During the amplification reaction, a droplet comprising a template DNA molecule containing an interrogated allele exhibits an increase in fluorescence relative to droplets that do not contain an interrogated allele. In some embodiments, the system further provides a droplet reader, which processes the droplets individually and collects fluorescence data from the droplets. The droplet reader may, for example, detect fluorescent signals from spectrally distinct fluorophores. In some cases, the droplet reader further comprises handling capabilities for droplet samples, with individual droplets entering the detector, undergoing detection, and then exiting the detector. For example, a flow cytometry device can be adapted for use in detecting fluorescence from droplet samples. In some cases, a microfluidic device equipped with pumps to control droplet movement is used to detect fluorescence from droplets in single file. In some cases, droplets are arrayed on a two-dimensional surface and a detector moves relative to the surface, detecting fluorescence at each position containing a single droplet. Exemplary droplet readers useful in the present disclosure are provided in International Application No. PCT/US2009/005317.

Other exemplary systems for use with the method of the disclosure are described, for example, in PCT Patent Application Pubs. WO 2007/091228 (U.S. Ser. No. 12/092,261); WO 2007/091230 (U.S. Ser. No. 12/093,132); and WO 2008/038259. Systems useful in practicing the disclosure include, e.g., systems from Stokes Bio (www.stokebio.ie), Fluidigm (www.fluidigm.com), Bio-Rad Laboratories (www.bio-rad.com), RainDance Technologies (www.raindancetechnologies.com), Microfluidic Systems (www.microfluidicsystems.com), Nanostream (www.nanostream.com), and Caliper Life Sciences (www.caliperls.com). Other exemplary systems suitable for use with the methods of the disclosure are described, for example, in Zhang et al., Nucleic Acids Res., 35(13):4223-4237 (2007); Wang et al., J. Micromech. Microeng., 15:1369-1377 (2005); Jia et al., 38:2143-2149 (2005); Kim et al., Biochem. Eng. J., 29:91-97; Chen et al., Anal. Chem., 77:658-666; Chen et al., Analyst, 130:931-940 (2005); Munchow et al., Expert Rev. Mol. Diagn., 5:613-620 (2005); Charbert et al., Anal. Chem., 78:7722-7728 (2006); and Dorfman et al., Anal. Chem., 77:3700-3704 (2005).

In some embodiments, the system further comprises a computer which stores and processes data. A computer-executable logic may be employed to perform such functions as subtraction of background fluorescence, assignment of target and/or reference sequences, and quantification of the data. For example, the number of droplets containing fluorescence corresponding to the presence of a particular allele (e.g., a mutant allele) in the sample may be counted and compared to the number of droplets containing fluorescence corresponding to the presence of another allele at the locus (such as, e.g., a wild-type allele).

Subject-Specific Report

In some embodiments, methods for assessing cancer as described herein further comprise generating a subject-specific report on the tumor profile. The tumor profile can comprise a mutational status of one or more genes in the set of genes sequenced. The method can further comprise generating a subject-specific report on mutational status of the subset of genes over time. The subject-specific report can comprise information on dynamics of the tumor over time, based on a change in the level of cell-free DNA harboring the mutations in the subset of genes over time. An increase over time of cell-free DNA harboring the mutations can indicate an increase in tumor or cancer burden. A decrease over time of cell-free DNA harboring the mutations can indicate a decrease in tumor or cancer burden.
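The interpretation of cell-free DNA dynamics described above can be sketched as a comparison of serial measurements. The minimal example below compares the last measurement with the first and labels the trend; the function name, the use of only the first and last time points, and the optional tolerance are assumptions made for illustration, not a clinical decision rule.

```python
from typing import Sequence


def burden_trend(levels: Sequence[float], tolerance: float = 0.0) -> str:
    """Label the change in mutation-bearing cell-free DNA between first and last time points.

    levels must be in chronological order, in any consistent unit
    (e.g., mutant copies per ml of plasma, or mutant allele fraction).
    """
    if len(levels) < 2:
        raise ValueError("at least two time points are required")
    change = levels[-1] - levels[0]
    if change > tolerance:
        return "increase"   # can indicate an increase in tumor or cancer burden
    if change < -tolerance:
        return "decrease"   # can indicate a decrease in tumor or cancer burden
    return "no change"


# Hypothetical serial measurements for two subjects.
rising = burden_trend([12.0, 15.5, 40.0])
falling = burden_trend([40.0, 22.0, 3.0])
```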

In some embodiments, the report provides a stratification and/or annotation of treatment options for the subject, based on the subject's tumor-specific profile. The stratification and/or the annotation can be based on clinical information for the subject. The stratification can include ranking drug treatment options with a higher likelihood of efficacy higher than drug treatment options with a lower likelihood of efficacy or for which no information exists with regard to treating subjects with the determined status of the one or more molecular markers. The stratification can include indicating on the report one or more drug treatment options for which scientific information suggests the one or more drug treatment options will be efficacious in a subject, based on the status of one or more tumor-specific mutations from the subject. The stratification can include indicating on a report one or more drug treatment options for which some scientific information suggests the one or more drug treatment options will be efficacious in the subject, and some scientific information suggests the one or more drug treatment options will not be efficacious in the subject, based on the status of one or more tumor-specific mutations in the sample from the subject. The stratification can include indicating on a report one or more drug treatment options for which scientific information indicates the one or more drug treatment options will not be efficacious for the subject, based on the status of one or more tumor-specific mutations in the sample from the subject. The stratification can include color coding the listed drug treatment options on the report based on the rank of the predicted efficacy of the drug treatment options.
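One way such a stratification might be represented in software is sketched below. Each drug treatment option carries an evidence category, options are sorted so that those supported by evidence rank first, and each is tagged with a display color. The category names, the relative order of the lower-ranked categories, the specific colors, and the placeholder drug names are all assumptions for illustration; the disclosure does not specify them.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import List, Sequence, Tuple


class Evidence(IntEnum):
    """Evidence categories; a lower value ranks higher on the report (assumed order)."""
    SUPPORTS_EFFICACY = 0
    CONFLICTING = 1
    AGAINST_EFFICACY = 2
    NO_INFORMATION = 3


# Illustrative color scheme only.
COLOR = {
    Evidence.SUPPORTS_EFFICACY: "green",
    Evidence.CONFLICTING: "yellow",
    Evidence.AGAINST_EFFICACY: "red",
    Evidence.NO_INFORMATION: "gray",
}


@dataclass(frozen=True)
class DrugOption:
    name: str
    evidence: Evidence


def stratify(options: Sequence[DrugOption]) -> List[Tuple[str, str]]:
    """Return (drug name, color) pairs ranked by evidence category.

    sorted() is stable, so options in the same category keep their input order.
    """
    ranked = sorted(options, key=lambda option: option.evidence)
    return [(option.name, COLOR[option.evidence]) for option in ranked]


# Placeholder names, not real drugs.
report_rows = stratify([
    DrugOption("drug C", Evidence.AGAINST_EFFICACY),
    DrugOption("drug A", Evidence.SUPPORTS_EFFICACY),
    DrugOption("drug D", Evidence.NO_INFORMATION),
    DrugOption("drug B", Evidence.CONFLICTING),
])
```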

The annotation can include annotating a report for a condition in the NCCN Clinical Practice Guidelines in Oncology™ or the American Society of Clinical Oncology (ASCO) clinical practice guidelines. The annotation can include listing one or more FDA-approved drugs for off-label use, one or more drugs listed in a Centers for Medicare and Medicaid Services (CMS) anti-cancer treatment compendia, and/or one or more experimental drugs found in scientific literature, in the report. The annotation can include connecting a listed drug treatment option to a reference containing scientific information regarding the drug treatment option. The scientific information can be from a peer-reviewed article from a medical journal. The annotation can include using information provided by Ingenuity® Systems. The annotation can include providing a link to information on a clinical trial for a drug treatment option in the report. The annotation can include presenting information in a pop-up box or fly-over box near provided drug treatment options in an electronic based report. The annotation can include adding information to a report selected from the group consisting of one or more drug treatment options, scientific information concerning one or more drug treatment options, one or more links to scientific information regarding one or more drug treatment options, one or more links to citations for scientific information regarding one or more drug treatment options, and clinical trial information regarding one or more drug treatment options. An exemplary embodiment of a subject-specific report is depicted in FIG. 8.

Computer Systems

In another aspect, the disclosure provides computer systems for the monitoring of a cancer, generating a subject report, and/or communicating the report to a caregiver. In some embodiments, the disclosure provides computer systems for determining prognosis or determining efficacy of a therapy for a cancer in a subject in need thereof. The computer system can provide a report communicating said prognosis or therapy efficacy for said cancer. In some embodiments, the computer system executes instructions contained in a computer-readable medium. In some embodiments, the processor is associated with one or more controllers, calculation units, and/or other units of a computer system, or implemented in firmware. In some embodiments, one or more steps of the method are implemented in hardware. In some embodiments, one or more steps of the method are implemented in software. Software routines may be stored in any computer readable memory unit such as flash memory, RAM, ROM, magnetic disk, laser disk, or other storage medium as described herein or known in the art. Software may be communicated to a computing device by any known communication method including, for example, over a communication channel such as a telephone line, the internet, a wireless connection, or by a transportable medium, such as a computer readable disk, flash drive, etc. The one or more steps of the methods described herein may be implemented as various operations, tools, blocks, modules and techniques which, in turn, may be implemented in firmware, hardware, software, or any combination of firmware, hardware, and software. When implemented in hardware, some or all of the blocks, operations, techniques, etc. may be implemented in, for example, an application specific integrated circuit (ASIC), custom integrated circuit (IC), field programmable gate array (FPGA), or programmable logic array (PLA).

FIG. 9 depicts a computer system 900 adapted to enable a user to detect, analyze, and process patient data. The system 900 includes a central computer server 901 that is programmed to implement exemplary methods described herein. The server 901 includes a central processing unit (CPU, also “processor”) 905 which can be a single core processor, a multi core processor, or plurality of processors for parallel processing. The server 901 also includes memory 910 (e.g. random access memory, read-only memory, flash memory); electronic storage unit 915 (e.g. hard disk); communications interface 920 (e.g. network adaptor) for communicating with one or more other systems; and peripheral devices 925 which may include cache, other memory, data storage, and/or electronic display adaptors. The memory 910, storage unit 915, interface 920, and peripheral devices 925 are in communication with the processor 905 through a communications bus (solid lines), such as a motherboard. The storage unit 915 can be a data storage unit for storing data. The server 901 is operatively coupled to a computer network (“network”) 930 with the aid of the communications interface 920. The network 930 can be the Internet, an intranet and/or an extranet, an intranet and/or extranet that is in communication with the Internet, or a telecommunication or data network. The network 930 in some cases, with the aid of the server 901, can implement a peer-to-peer network, which may enable devices coupled to the server 901 to behave as a client or a server.

The storage unit 915 can store files, such as subject reports, and/or communications with the caregiver, sequencing data, data about individuals, or any aspect of data associated with the disclosure.

The server can communicate with one or more remote computer systems through the network 930. The one or more remote computer systems may be, for example, personal computers, laptops, tablets, telephones, smartphones, or personal digital assistants.

In some situations the system 900 includes a single server 901. In other situations, the system includes multiple servers in communication with one another through an intranet, extranet and/or the Internet.

The server 901 can be adapted to store sequencing information, or patient information, such as, for example, polymorphisms, mutations, patient history and demographic data and/or other information of potential relevance. Such information can be stored on the storage unit 915 or the server 901 and such data can be transmitted through a network.

Methods as described herein can be implemented by way of machine (or computer processor) executable code (or software) stored on an electronic storage location of the server 901, such as, for example, on the memory 910, or electronic storage unit 915. During use, the code can be executed by the processor 905. In some cases, the code can be retrieved from the storage unit 915 and stored on the memory 910 for ready access by the processor 905. In some situations, the electronic storage unit 915 can be precluded, and machine-executable instructions are stored on memory 910. Alternatively, the code can be executed on a second computer system 940. The computer system 940 and the central computer server 901 can be operated in the same geographical location. The computer system 940 and the central computer server 901 can be operated in different geographical locations.

Aspects of the systems and methods provided herein, such as the server 901, can be embodied in programming. Various aspects of the technology may be thought of as “products” or “articles of manufacture” typically in the form of machine (or processor) executable code and/or associated data that is carried on or embodied in a type of machine readable medium. Machine-executable code can be stored on an electronic storage unit, such as memory (e.g., read-only memory, random-access memory, flash memory) or a hard disk. “Storage” type media can include any or all of the tangible memory of the computers, processors or the like, or associated modules thereof, such as various semiconductor memories, tape drives, disk drives and the like, which may provide non-transitory storage at any time for the software programming. All or portions of the software may at times be communicated through the Internet or various other telecommunication networks. Such communications, for example, may enable loading of the software from one computer or processor into another, for example, from a management server or host computer into the computer platform of an application server. Thus, another type of media that may bear the software elements includes optical, electrical, and electromagnetic waves, such as used across physical interfaces between local devices, through wired and optical landline networks and over various air-links. The physical elements that carry such waves, such as wired or wireless links, optical links, or the like, also may be considered as media bearing the software. As used herein, unless restricted to non-transitory, tangible “storage” media, terms such as computer or machine “readable medium” can refer to any medium that participates in providing instructions to a processor for execution.

Hence, a machine readable medium, such as computer-executable code, may take many forms, including but not limited to, tangible storage medium, a carrier wave medium, or physical transmission medium. Non-volatile storage media can include, for example, optical or magnetic disks, such as any of the storage devices in any computer(s) or the like, such as may be used to implement the system. Tangible transmission media can include: coaxial cables, copper wires, and fiber optics (including the wires that comprise a bus within a computer system). Carrier-wave transmission media may take the form of electric or electromagnetic signals, or acoustic or light waves such as those generated during radio frequency (RF) and infrared (IR) data communications. Common forms of computer-readable media therefore include, for example: a floppy disk, a flexible disk, hard disk, magnetic tape, any other magnetic medium, a CD-ROM, DVD, DVD-ROM, any other optical medium, punch cards, paper tape, any other physical storage medium with patterns of holes, a RAM, a ROM, a PROM and EPROM, a FLASH-EPROM, any other memory chip or cartridge, a carrier wave transporting data or instructions, cables, or links transporting such carrier wave, or any other medium from which a computer may read programming code and/or data. Many of these forms of computer readable media may be involved in carrying one or more sequences of one or more instructions to a processor for execution.

The results of monitoring of a cancer, generating a subject report, and/or communicating the report to a caregiver can be presented to a user with the aid of a user interface, such as a graphical user interface.

A computer system may be used for one or more steps, including, e.g., sample collection, sample processing, sequencing, allele detection, receiving patient history or medical records, receiving and storing measurement data regarding a detected level of tumor-specific mutations in a subject or sample obtained from a subject, analyzing said measurement data to determine a diagnosis, prognosis, or therapeutic efficacy, generating a report, and reporting results to a receiver.

A client-server and/or relational database architecture can be used in the disclosure. In general, a client-server architecture is a network architecture in which each computer or process on the network is either a client or a server. Server computers can be powerful computers dedicated to managing disk drives (file servers), printers (print servers), or network traffic (network servers). Client computers can include PCs (personal computers) or workstations on which users run applications, as well as example output devices as disclosed herein. Client computers can rely on server computers for resources, such as files, devices, and even processing power. The server computer handles all of the database functionality. The client computer can have software that handles front-end data management and receives data input from users.

After performing a calculation, a processor can provide the output, such as from a calculation, back to, for example, the input device or storage unit, to another storage unit of the same or different computer system, or to an output device. Output from the processor can be displayed by a data display, e.g., a display screen (for example, a monitor or a screen on a digital device), a print-out, a data signal (for example, a packet), a graphical user interface (for example, a webpage), an alarm (for example, a flashing light or a sound), or a combination of any of the above. In an embodiment, an output is transmitted over a network (for example, a wireless network) to an output device. The output device can be used by a user to receive the output from the data-processing computer system. After an output has been received by a user, the user can determine a course of action, or can carry out a course of action, such as a medical treatment when the user is medical personnel. In some embodiments, an output device is the same device as the input device. Example output devices include, but are not limited to, a telephone, a wireless telephone, a mobile phone, a PDA, a flash memory drive, a light source, a sound generator, a fax machine, a computer, a computer monitor, a printer, an iPod, and a webpage. The user station may be in communication with a printer or a display monitor to output the information processed by the server. Such displays, output devices, and user stations can be used to provide an alert to the subject or to a caregiver thereof.

Data relating to the present disclosure can be transmitted over a network or connections for reception and/or review by a receiver. The receiver can be, but is not limited to, the subject to whom the report pertains; a caregiver thereof, e.g., a health care provider, manager, other healthcare professional, or other caretaker; a person or entity that performed and/or ordered the genotyping analysis; or a genetic counselor. The receiver can also be a local or remote system for storing such reports (e.g. servers or other systems of a “cloud computing” architecture). In one embodiment, a computer-readable medium includes a medium suitable for transmission of a result of an analysis of a biological sample.

An exemplary embodiment of a subject-specific report is depicted in FIG. 8. The computer system can comprise a user accessible module which enables clinicians to request that a service be performed. Clinicians can enter patient demographic and medical history information into the computer system. The computer system can process the entered information and create a barcode label that can be applied to the sample being analyzed. The barcoded sample can be sent for analysis to a third party analyzer. The barcoded information would be inaccessible to the third party analyzer to maintain compliance with the Health Insurance Portability and Accountability Act (HIPAA). Information that can be anonymized can be accessible to the third party analyzer. The barcode can be used to track the progression of the sample through the analysis workflow resulting in the generation of an encrypted final report. The encrypted final report can be decrypted and made accessible to the clinician who originally entered the sample information.
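A minimal sketch of the barcode workflow described above is given below in Python. The field names, the barcode format, and the in-memory registry are hypothetical. Encryption of the final report is omitted because it would require a cryptography library outside the Python standard library, and the sketch does not by itself establish HIPAA compliance.

```python
import uuid

# Hypothetical set of fields treated as identifying; a real system would
# define this set according to its own compliance requirements.
IDENTIFYING_FIELDS = {"name", "date_of_birth", "medical_record_number"}


def register_sample(record, registry):
    """Assign a random barcode to a sample.

    The full record is kept in the local registry (clinician side). The
    returned payload, intended for the third party analyzer, carries only
    the barcode and the non-identifying fields.
    """
    barcode = uuid.uuid4().hex[:12].upper()
    registry[barcode] = dict(record)
    payload = {k: v for k, v in record.items() if k not in IDENTIFYING_FIELDS}
    payload["barcode"] = barcode
    return barcode, payload


def attach_report(barcode, report, registry):
    """Re-link a returned report to the original record via the barcode."""
    entry = registry[barcode]
    entry["report"] = report
    return entry


if __name__ == "__main__":
    registry = {}
    record = {"name": "Jane Doe", "date_of_birth": "1970-01-01", "tumor_type": "NSCLC"}
    barcode, payload = register_sample(record, registry)
    print(payload)  # contains tumor_type and barcode, but no name or date of birth
    attach_report(barcode, "final report text", registry)
```

Keeping the barcode-to-patient mapping only on the clinician side is what allows the third party analyzer to track the sample without access to the identifying information.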

Ligation Method:

In some aspects, the disclosure provides methods and kits for performing highly efficient ligation reactions. In some embodiments, the methods comprise ligation of donor nucleic acids to acceptor nucleic acids. In some embodiments, the methods improve ligation efficiency by over 2-fold, 5-fold, 10-fold, 50-fold, 100-fold, 500-fold, 1000-fold, or more than 1000-fold as compared to current methods. The methods described herein can, for example, increase ligation efficiency to over 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 97%, 98%, 99%, 99.5%, or 99.9% efficiency. In some embodiments, the methods described herein can increase the specificity of a ligation reaction, resulting in, for example, over 30%, over 40%, over 50%, over 60%, over 70%, over 80%, over 85%, over 90%, over 95%, over 97%, over 98%, over 99%, over 99.5%, over 99.9%, or substantially all of ligation products resulting from a desired donor-acceptor ligation, as compared to undesired ligation products, e.g., unwanted donor-donor or acceptor-acceptor concatamers. The methods described herein can result in ligation of over 50%, over 60%, over 70%, over 80%, over 85%, over 90%, over 95%, over 97%, over 98%, over 99%, over 99.5%, over 99.9%, or substantially all of the plurality of the donor or acceptor nucleic acid molecules, respectively, to the acceptor or donor nucleic acid molecules. A nucleic acid molecule (donor or acceptor) in the ligation reaction can be over 120 nucleotides in length. Such highly efficient ligation methods can be used to improve a wide range of applications, some of which are described herein by example.

FIG. 10A depicts an exemplary embodiment of a method of the disclosure. In a first step (1), the method comprises transferring a nucleotide monophosphate (NMP) to an amount of donor nucleic acid molecules in a reaction mixture for a time sufficient to effect an accumulation of NMP-carrying donor nucleic acid molecules. In some embodiments, N=A. In some embodiments, N=G. A donor nucleic acid molecule can comprise a 5′ or 3′ phosphate group. In some embodiments, N=A, and a donor nucleic acid molecule comprises a 5′ phosphate group. In some embodiments, N=G, and a donor nucleic acid molecule comprises a 3′ phosphate group. In some embodiments, the reaction results in transfer of NMP to at least 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, or 90% of the donor nucleic acid molecules present in the reaction mixture. In a second step (2), the method further comprises effecting formation of a covalent bond between an acceptor nucleic acid molecule and the NMP-carrying donor nucleic acid molecule (e.g., ligating an acceptor nucleic acid molecule to the NMP-carrying donor nucleic acid molecule). In some embodiments, the adenylation and ligation steps are carried out serially in a single reaction mixture. In some embodiments, the adenylated donor nucleic acid molecules are not separated from the reaction mixture prior to the second step (e.g., ligation step). In some embodiments, enzyme (e.g., ligase)/nucleic acid complexes are sedimented between adenylation and ligation steps. In some embodiments, the first and second steps are carried out serially in the reaction mixture. In some embodiments, the ligation step is carried out after completion of the adenylation step. In some embodiments, the reaction mixture in which ligation occurs comprises a pH in a range of about pH 1 to pH 14. In some embodiments, the reaction mixture in which ligation occurs comprises a pH of at least pH 7.1, pH 7.2, pH 7.3, pH 7.4, pH 7.5, pH 7.6, pH 7.7, pH 7.8, pH 7.9, pH 8, pH 8.1, pH 8.2, pH 8.3, pH 8.4, pH 8.5, pH 8.6, pH 8.7, pH 8.8, pH 8.9, pH 9, pH 9.5, pH 10, pH 10.5, pH 11, pH 11.5, pH 12, pH 12.5, pH 13, or greater. In some embodiments, the reaction mixture in which ligation occurs comprises a neutral pH (pH 7.0). In some embodiments, the reaction mixture in which ligation occurs comprises a pH of about pH 7.1 to about pH 9, about pH 7.5 to about pH 9, about pH 8 to about pH 10, or about pH 7 to about pH 8. The pH of a reaction mixture in which ligation occurs can be less than pH 14, 13, 12, 11, 10, 9, 8, 7, 6.5, 6, 5.5, 5, 4.5, 4, 3.5, 3, 2.5, 2, 1.5, or 1. The pH of a reaction mixture in which ligation occurs can be about pH 5 to about pH 6, about pH 4 to about pH 5, about pH 3 to about pH 4, about pH 2 to about pH 3, or about pH 1 to about pH 2.

In some embodiments, the reaction mixture in which adenylation occurs comprises a pH in a range of about pH 1 to pH 14. In some embodiments, the reaction mixture in which adenylation occurs comprises a pH of at least pH 7.1, pH 7.2, pH 7.3, pH 7.4, pH 7.5, pH 7.6, pH 7.7, pH 7.8, pH 7.9, pH 8, pH 8.1, pH 8.2, pH 8.3, pH 8.4, pH 8.5, pH 8.6, pH 8.7, pH 8.8, pH 8.9, pH 9, pH 9.5, pH 10, pH 10.5, pH 11, pH 11.5, pH 12, pH 12.5, pH 13, or greater. In some embodiments, the reaction mixture in which adenylation occurs comprises a neutral pH (pH 7.0). In some embodiments, the reaction mixture in which adenylation occurs comprises a pH of about pH 7.1 to about pH 9, about pH 7.5 to about pH 9, about pH 8 to about pH 10, or about pH 7 to about pH 8. The pH of a reaction mixture in which adenylation occurs can be less than pH 14, 13, 12, 11, 10, 9, 8, 7, 6.5, 6, 5.5, 5, 4.5, 4, 3.5, 3, 2.5, 2, 1.5, or 1. The pH of a reaction mixture in which ligation occurs can be about pH 5 to about pH 6, about pH 4 to about pH 5, about pH 3 to about pH 4, about pH 2 to about pH 3, or about pH 1 to about pH 2.

In some embodiments, over 10%, over 20%, over 30%, over 40%, over 50%, over 60%, over 70%, over 80%, over 90%, over 95%, over 97%, over 98%, over 99%, over 99.5%, over 99.9%, or substantially all of the donor nucleic acid molecules are carrying an NMP molecule upon commencement of the ligation step.

In some embodiments, an enzyme, e.g., a ligase, and a bound or complexed nucleic acid, e.g., a single stranded donor nucleic acid that comprises an NMP, e.g., a 5′ NMP or a 3′ NMP, is sedimented in a reaction mixture. The sedimentation can be performed after, or during, a reaction in which an NMP is transferred to a donor nucleic acid molecule, e.g., a single stranded donor nucleic acid molecule. For example, the sedimentation can be performed during or after an adenylation reaction. In some embodiments, sedimentation is used to separate an enzyme, e.g., a ligase, that is not bound to or complexed with a nucleic acid, from an enzyme, e.g., a ligase, that is bound to a nucleic acid. In some embodiments, sedimentation is used to separate free NTP, e.g., ATP, in a reaction mixture after a reaction in which an NMP is added to a nucleic acid, e.g., adenylation of nucleic acid. Following sedimentation, supernatant can be removed from a reaction vessel, e.g., using a pipette. Sedimented material can be washed, e.g., using a 2×PEGppt solution (1×NEB4, 10 ug LPA, 30% PEG-8000) diluted to 1×. In some cases, sedimented material is not washed. Sedimentation can be achieved by using magnetic beads or carboxylate beads. Sedimentation can be achieved by subjecting the reaction mixture to centrifugation and removing the supernatant. In some embodiments, sedimentation is facilitated by increasing the concentration of salt or concentration of Mn²⁺.

In some embodiments the donor and/or acceptor nucleic acid molecules are fully or partially denatured. Full or partial denaturation can be achieved by any means known in the art, including, e.g., heat denaturation, incubation in basic pH, denaturation in formamide, and/or urea denaturation. Heat denaturation can be achieved by heating a nucleic acid sample to about 60° C. or above, about 65° C. or above, about 70° C. or above, about 75° C. or above, about 80° C. or above, about 85° C. or above, about 90° C. or above, about 95° C. or above, or about 100° C. or above. The nucleic acid sample can be heated by any means known in the art, including, e.g., incubation in a water bath, a temperature controlled heat block, or a thermal cycler.

Denaturation by incubation in basic pH can comprise incubation of the nucleic acid sample in any solution (e.g., a buffer) of pH greater than pH 7, greater than pH 8, greater than pH 9, greater than pH 10, greater than pH 11, greater than pH 12, greater than pH 13 or greater. In some embodiments, denaturation is achieved by incubating in a basic pH that is close to neutral. In some embodiments, denaturation is achieved by incubating in a basic pH between about pH 7 to about pH 13, about pH 7.5 to about pH 8, or about pH 8.5 to about pH 10. Denaturation by incubation in basic pH can be achieved by, for example, incubation of a nucleic acid sample in a solution comprising sodium hydroxide (NaOH), potassium hydroxide (KOH), sodium bicarbonate, sodium phosphate, or Tris. The solution can comprise about 1 mM NaOH, 2 mM NaOH, 5 mM NaOH, 10 mM NaOH, 20 mM NaOH, 40 mM NaOH, 60 mM NaOH, 80 mM NaOH, 100 mM NaOH, 0.2M NaOH, about 0.3M NaOH, about 0.4M NaOH, about 0.5M NaOH, about 0.6M NaOH, about 0.7M NaOH, about 0.8M NaOH, about 0.9M NaOH, about 1.0M NaOH, or greater than 1.0M NaOH. The solution can comprise about 1 mM KOH, 2 mM KOH, 5 mM KOH, 10 mM KOH, 20 mM KOH, 40 mM KOH, 60 mM KOH, 80 mM KOH, 100 mM KOH, 0.2M KOH, 0.5M KOH, 1M KOH, or greater than 1M KOH. In some embodiments, the nucleic acid sample is incubated in NaOH or KOH for about 0.5, 1, 1.5, 2, 2.5, 3, 3.5, 4, 4.5, 5, 6, 7, 8, 9, 10, 12, 14, 16, 18, 20, 25, or 30 minutes. In some embodiments, the nucleic acid sample is incubated in ammonium-acetate following NaOH or KOH incubation.
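To relate the NaOH and KOH concentrations listed above to the recited pH values, the following Python sketch estimates the pH of a strong-base solution. It assumes an unbuffered, fully dissociated base at 25° C. (pKw of 14) with ideal behavior, which is a textbook approximation and not a measurement from the disclosure; buffered solutions such as Tris or sodium phosphate are not covered.

```python
import math


def strong_base_ph(molar_conc):
    """Approximate pH of an unbuffered strong base (e.g., NaOH or KOH).

    Assumes complete dissociation, ideal behavior, and pKw = 14 (25 degrees C),
    so pH = 14 + log10([OH-]). The approximation is intended for the
    millimolar-to-molar range recited in the text.
    """
    if molar_conc <= 0:
        raise ValueError("concentration must be positive")
    return 14.0 + math.log10(molar_conc)


if __name__ == "__main__":
    for conc in (0.001, 0.01, 0.1, 1.0):
        print(f"{conc} M -> pH ~ {strong_base_ph(conc):.1f}")
```

Under these assumptions, 1 mM hydroxide corresponds to roughly pH 11 and 0.1 M to roughly pH 13.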

Compounds like urea and formamide contain functional groups that can form hydrogen bonds with the electronegative centers of the nucleotide bases. At high concentrations (e.g., 8M urea or 70% formamide) of the denaturant, the competition for hydrogen bonds favors interactions between the denaturant and the N-bases rather than between complementary bases, thereby separating the two strands.

Without wishing to be bound by theory, in a typical ligation method, the intermediate steps of (1) transferring a NMP to the ligase and (2) transferring the NMP to the donor nucleic acid molecule, generally co-occur with the ligation step (3), and are reversible at neutral pH. The co-occurrence of all three steps and the reversibility of steps (1) and (2) can lead to poor ligation efficiency and poor specificity of the ligation products due to several factors, such as, e.g., the possibility of transferring NMP (e.g., adenylation, guanylylation) to both donor and acceptor species, and removal of NMP from the ligase and/or donor (or acceptor) species (e.g., de-adenylation or de-guanylylation of the ligase and/or of the donor (or acceptor) species) before ligation can occur. However, by performing the step of transferring NMP to the donor nucleic acid molecule and the step of ligation serially, it is possible to increase ligation efficiency by effecting an accumulation of NMP-carrying donor nucleic acid molecules prior to ligation to an acceptor species.

In some embodiments, reversibility of intermediate steps 1 & 2 isexploited to control the outcome of the reaction. In some embodiments,reversibility is controlled by modulating the relative concentrations ofeach component of the reaction mixture (e.g., ligase, nucleosidetriphosphate (NTP), donor, and acceptor) to promote, e.g., adenylationover de-adenylation. By way of example only, if donor and acceptornucleic acid species are present in adenylation reaction and comprisephosphorylated 5′ termini, the adenylation step becomes non-specific fordonor and acceptor species, which can lead to non-specific formation ofunwanted ligation products. However, if only the donor species ispresent for the adenylation step then adenylation can be made specificfor the donor species. In such cases, the amount of ATP and ligase alsoaffect the predominance of adenylation vs. de-adenylation. For example,self-ligation of the donor species can predominate at low concentrationsof ligase, where high concentrations of ATP (e.g., less than the amountof donor nucleic acid molecules), can lead to unwanted concatenation ofdonor species. Limiting the amount of ATP can control the extent ofconcatenation observed. Accordingly, in some embodiments, the NMPtransfer steps occur in a reaction mixture comprising an amount of donornucleic acid molecules and an amount of a ligase that is at leastequimolar to or in excess of the amount of donor nucleic acid molecules.Donor nucleic acid molecules in the reaction mixture prior to theligating step can be present in an amount of 0.1-10, 5-30, 10-50,20-100, 50-200, 100-500, 200-1000 ng/μl. 
Donor nucleic acid molecules in the reaction mixture prior to the ligating step can be present in an amount to provide about 0.01 pmol, 0.05 pmol, 0.1 pmol, 0.15 pmol, 0.2 pmol, 0.25 pmol, 0.5 pmol, 0.55 pmol, 0.6 pmol, 0.65 pmol, 0.7 pmol, 0.75 pmol, 0.8 pmol, 0.85 pmol, 0.9 pmol, 0.95 pmol, 1 pmol, 1.1 pmol, 1.2 pmol, 1.3 pmol, 1.4 pmol, 1.5 pmol, 1.6 pmol, 1.7 pmol, 1.8 pmol, 1.9 pmol, 2 pmol, 5 pmol, 10 pmol, 15 pmol, 20 pmol, 25 pmol, 30 pmol, 35 pmol, 40 pmol, 45 pmol, 50 pmol, 55 pmol, 60 pmol, 65 pmol, 70 pmol, 75 pmol, 80 pmol, 85 pmol, 90 pmol, 95 pmol, 100 pmol, 110 pmol, 120 pmol, 130 pmol, 140 pmol, 150 pmol, 160 pmol, 170 pmol, 180 pmol, 190 pmol, 200 pmol, 300 pmol, 400 pmol, 500 pmol, 600 pmol, 700 pmol, 800 pmol, 900 pmol, 1000 pmol (1 nmol), 2 nmol, 5 nmol, 10 nmol, or more than 10 nmol of 5′ termini. In some embodiments, the amount of ligase is at least 1×, 1.25×, 1.5×, 2×, 3×, 4×, 5×, 7.5×, 10×, 15×, 20×, or over 20× the amount of donor nucleic acid molecules. In some embodiments, the amount of ligase is 1-5×, 2-10×, 5-20× or over 20× the amount of donor nucleic acid molecules. In some embodiments, the amount of ligase in the reaction mixture is about 0.01, 0.05, 0.1, 0.5, 1, 1.5, 2, 4, 6, 8, 10, or more than 10 μM. In some embodiments, the adenylation steps occur in a reaction mixture comprising an amount of donor nucleic acid molecules and an amount of ligase that is at least 0.25-fold higher, 0.5-fold higher, 1-fold higher, 1.5-fold higher, 2-fold higher, 3-fold higher, 4-fold higher, 5-fold higher, 6-fold higher, 7-fold higher, 8-fold higher, 9-fold higher, 10-fold higher, 15-fold higher, 20-fold higher, or more than 20-fold higher than the amount of donor nucleic acid molecules.
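The relationship between a donor amount expressed as mass, the corresponding pmol of 5′ termini, and the amount of ligase needed for a given molar excess can be sketched as follows in Python. The average mass of about 330 g/mol per nucleotide of single-stranded DNA is a commonly used approximation and is an assumption of this sketch, as are the function names and the example values; each single-stranded donor molecule is taken to contribute one 5′ terminus.

```python
# Assumed average molar mass of one nucleotide in single-stranded DNA (g/mol).
AVG_NT_MASS_G_PER_MOL = 330.0


def donor_pmol(mass_ng, length_nt):
    """pmol of single-stranded donor molecules (one 5' terminus each).

    ng -> g is a factor of 1e-9 and mol -> pmol is a factor of 1e12,
    hence the net factor of 1000.
    """
    return mass_ng * 1000.0 / (length_nt * AVG_NT_MASS_G_PER_MOL)


def ligase_pmol_needed(donor_pmol_value, fold_excess=1.0):
    """pmol of ligase for a given molar ratio of ligase to donor (1.0 = equimolar)."""
    return donor_pmol_value * fold_excess


def pmol_to_micromolar(pmol, volume_ul):
    """Concentration in micromolar; 1 pmol per microliter equals 1 micromolar."""
    return pmol / volume_ul


if __name__ == "__main__":
    donors = donor_pmol(mass_ng=33.0, length_nt=100)          # about 1 pmol
    ligase = ligase_pmol_needed(donors, fold_excess=5.0)      # about 5 pmol
    print(donors, ligase, pmol_to_micromolar(ligase, volume_ul=20.0))
```

For example, under these assumptions 33 ng of a 100-nucleotide donor is about 1 pmol of 5′ termini, and a 5× molar excess of ligase in a 20 μl reaction corresponds to about 0.25 μM ligase.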

The ligase can be an ATP-dependent ligase. The ATP-dependent ligase can be an RNA ligase. The RNA ligase can be, e.g., an archaeal RNA ligase, e.g., an archaeal RNA ligase from the thermophilic archaeon Methanobacterium thermoautotrophicum (MthRnl). The RNA ligase can be an Rnl 1 family ligase. Generally, Rnl 1 family ligases can repair single-stranded breaks in tRNA. Exemplary Rnl 1 family ligases include, e.g., T4 RNA ligase, or thermostable RNA ligase 1 from Thermus scotoductus bacteriophage TS2126 (CircLigase or CircLigase II). Such ligases can be described in WIPO Patent Application Publication No. WO2010094040, hereby incorporated by reference. The RNA ligase can be an Rnl 2 family ligase. Generally, Rnl 2 family ligases can seal nicks in duplex RNAs. Exemplary Rnl 2 family ligases include, e.g., T4 RNA ligase 2. In some embodiments, the ATP-dependent ligase is an ATP-dependent DNA ligase. The ATP-dependent DNA ligase can be a T4 DNA ligase. These ligases generally catalyze the ATP-dependent formation of a phosphodiester bond between a nucleotide 3′-OH nucleophile and a phosphate of a 5′ AMP•P group.

In some embodiments, the ligase is a GTP-dependent ligase. The GTP-dependent ligase can be an RNA ligase. The GTP-dependent RNA ligase can be RtcB RNA ligase. The RtcB ligase can catalyze a GTP-dependent formation of a phosphodiester bond between a phosphate of a 3′ GMP•P group and a nucleotide 5′-OH nucleophile.

In some embodiments, the reaction mixture comprises an amount of NTP sufficient to promote transfer of NMP to donor nucleic acid molecules over removal of NMP from the donor nucleic acid molecules (e.g., promotes adenylation or guanylylation over de-adenylation or de-guanylylation). In some embodiments, the amount of NTP is sufficient to inhibit formation of a covalent bond between adenylated donor nucleic acid molecules. In some embodiments, the adenylation steps occur in a reaction mixture comprising an amount of donor nucleic acid molecules, an amount of NTP-dependent ligase, and an amount of NTP that is at least 2-fold, 3-fold, 4-fold, 5-fold, 6-fold, 7-fold, 8-fold, 9-fold, or 10-fold higher than a Michaelis constant (Km) of the NTP-dependent ligase. In some embodiments, the adenylation steps occur in a reaction mixture comprising an amount of donor nucleic acid molecules, an amount of NTP-dependent ligase that is at least equimolar to or in excess of the amount of donor nucleic acid molecules, and an amount of NTP that is at least 2-fold, 3-fold, 4-fold, 5-fold, 6-fold, 7-fold, 8-fold, 9-fold, or 10-fold higher than the Michaelis constant (Km) of the NTP-dependent ligase. In particular embodiments, about 10 μM, 20 μM, 30 μM, 40 μM, 50 μM, 60 μM, 70 μM, 80 μM, 90 μM, 100 μM, 200 μM, 300 μM, 400 μM, 500 μM, 600 μM, 700 μM, 800 μM, 900 μM, or 1000 μM of NTP is present in the reaction mixture. Such amounts of NTP may inhibit the ligation step.
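As an illustration of the fold-over-Km criterion described above, the following Python sketch computes how many fold an NTP concentration exceeds a Km and the corresponding fractional saturation. The saturation formula assumes simple Michaelis-Menten behavior, which is an assumption of this sketch rather than a statement about any particular ligase, and the Km value used in the example is hypothetical.

```python
def ntp_fold_over_km(ntp_um, km_um):
    """Ratio of NTP concentration to Km (both in micromolar)."""
    if km_um <= 0:
        raise ValueError("Km must be positive")
    return ntp_um / km_um


def meets_fold(ntp_um, km_um, min_fold=2.0):
    """True if the NTP concentration is at least min_fold times Km."""
    return ntp_fold_over_km(ntp_um, km_um) >= min_fold


def fractional_saturation(ntp_um, km_um):
    """v / Vmax = [S] / (Km + [S]) under simple Michaelis-Menten kinetics."""
    if km_um <= 0 or ntp_um < 0:
        raise ValueError("Km must be positive and NTP non-negative")
    return ntp_um / (km_um + ntp_um)


if __name__ == "__main__":
    km_um = 10.0  # hypothetical Km, for illustration only
    for ntp_um in (20.0, 50.0, 100.0):
        print(ntp_um, ntp_fold_over_km(ntp_um, km_um),
              round(fractional_saturation(ntp_um, km_um), 3))
```

Under this simplified model, an NTP concentration of 2-fold Km corresponds to about 67% of maximal rate and 10-fold Km to about 91%.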

The reaction mixture in which adenylation occurs can further comprise a cation. The cation can be Mg²⁺, or can be Mn²⁺. In some embodiments, the cation is Mg²⁺. The Mg²⁺ can be present in the reaction mixture at a final concentration of 0.1 mM-1 mM, 1 mM-10 mM, 5-20 mM, 10-50 mM, 30-100 mM, or more than 100 mM. The Mg²⁺ can be present in the reaction mixture at a final concentration of about 0.1 mM, 0.5 mM, 1 mM, 1.5 mM, 2 mM, 2.5 mM, 3 mM, 3.5 mM, 4 mM, 4.5 mM, 5 mM, 5.5 mM, 6 mM, 6.5 mM, 7 mM, 7.5 mM, 8 mM, 8.5 mM, 9 mM, 9.5 mM, or 10 mM. In some embodiments, the Mg²⁺ can be present in the reaction mixture at a final concentration of about 1 mM to about 5 mM, about 3 mM to about 8 mM, or about 4 mM to about 10 mM. In some embodiments, the Mg²⁺ can be present in the reaction mixture at a final concentration of about 2.5 mM to about 7.5 mM. In some embodiments, the Mg²⁺ can be present in the reaction mixture at a final concentration of about 10 mM. The Mg²⁺ can be present in the reaction mixture at a final concentration of about 0.1 mM to about 1 mM, about 1 mM to about 10 mM, about 5 to about 20 mM, about 10 to about 50 mM, about 30 to about 100 mM, or more than 100 mM. In some embodiments, the cation is Mn²⁺. The Mn²⁺ can be present in the reaction mixture at a final concentration of about 0.1 mM, 0.5 mM, 1 mM, 1.5 mM, 2 mM, 2.5 mM, 3 mM, 3.5 mM, 4 mM, 4.5 mM, 5 mM, 5.5 mM, 6 mM, 6.5 mM, 7 mM, 7.5 mM, 8 mM, 8.5 mM, 9 mM, 9.5 mM, or 10 mM. In some embodiments, the Mn²⁺ can be present in the reaction mixture at a final concentration of about 1 mM to about 5 mM, about 3 mM to about 8 mM, or about 4 mM to about 10 mM. In some embodiments, the Mn²⁺ can be present in the reaction mixture at a final concentration of about 2.5 mM to about 7.5 mM. In some embodiments, the Mn²⁺ can be present in the reaction mixture at a final concentration of about 10 mM. The Mn²⁺ can be present in the reaction mixture at a final concentration of about 0.1 mM to about 1 mM, about 1 mM to about 10 mM, about 5 to about 20 mM, about 10 to about 50 mM, about 30 to about 100 mM, or more than 100 mM. In some embodiments, the cation is present in an amount sufficient to catalyze adenylation of the ligase and subsequent adenylation of the donor nucleic acid molecules.

The reaction mixture in which adenylation occurs can comprise a pH in a range of about pH 1 to about pH 14. In some embodiments, the reaction mixture in which adenylation occurs comprises a pH of at least, or about, pH 7.1, pH 7.2, pH 7.3, pH 7.4, pH 7.5, pH 7.6, pH 7.7, pH 7.8, pH 7.9, pH 8, pH 8.1, pH 8.2, pH 8.3, pH 8.4, pH 8.5, pH 8.6, pH 8.7, pH 8.8, pH 8.9, pH 9, pH 9.5, pH 10, pH 10.5, pH 11, pH 11.5, pH 12, pH 12.5, pH 13, or greater. In some embodiments, the reaction mixture in which adenylation occurs comprises a neutral pH (7.0). In some embodiments, the reaction mixture in which adenylation occurs comprises a pH of about pH 7.1 to about pH 9, about pH 7.5 to about pH 9, about pH 8 to about pH 10, or about pH 7 to about pH 8. The pH of a reaction mixture in which adenylation occurs can be less than pH 14, 13, 12, 11, 10, 9, 8, 7, 6.5, 6, 5.5, 5, 4.5, 4, 3.5, 3, 2.5, 2, 1.5, or 1. The pH of a reaction mixture in which ligation occurs can be about pH 5 to about pH 6, about pH 4 to about pH 5, about pH 3 to about pH 4, about pH 2 to about pH 3, or about pH 1 to about pH 2.

In some embodiments the reaction mixture further comprises a high molecular weight inert molecule, e.g., PEG of MW 4000, 6000, or 8000. In some embodiments, the inert molecule is present in an amount that is about 0.5%, 1%, 2%, 3%, 4%, 5%, 7.5%, 10%, 12.5%, 13%, 13.5%, 14%, 14.5%, 15%, 15.5%, 16%, 16.5%, 17%, 17.5%, 18%, 18.5%, 19%, 19.5%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, or greater than 50% weight/volume. In some embodiments, the inert molecule is present in an amount that is about 0.5-2%, about 1-5%, about 2-15%, about 10-20%, about 15-30%, about 20-50%, or more than 50% weight/volume.
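
As a worked illustration of the weight/volume percentages recited above, the following Python sketch converts a target percentage into a mass of inert molecule, or into a volume of a concentrated stock. The reaction volume and stock concentration in the example are hypothetical values chosen for illustration and are not taken from the disclosure.

```python
def mass_mg_for_percent_wv(percent_wv: float, volume_ul: float) -> float:
    """Mass (mg) of solute giving `percent_wv` % weight/volume in `volume_ul`.

    1% w/v is 1 g per 100 mL, which is the same as 0.01 mg per microliter.
    """
    if percent_wv < 0 or volume_ul <= 0:
        raise ValueError("percent must be >= 0 and volume must be > 0")
    return percent_wv * volume_ul / 100.0


def stock_volume_ul(target_percent: float, final_volume_ul: float,
                    stock_percent: float) -> float:
    """Volume (uL) of a stock solution needed to reach the target % w/v."""
    if not 0 <= target_percent <= stock_percent:
        raise ValueError("target must lie between 0 and the stock percentage")
    return target_percent * final_volume_ul / stock_percent


if __name__ == "__main__":
    # Hypothetical example: 20% w/v PEG in a 50 uL reaction from a 50% stock.
    print(mass_mg_for_percent_wv(20, 50))   # mass of PEG in the reaction, mg
    print(stock_volume_ul(20, 50, 50))      # volume of 50% stock to add, uL
```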

The NMP transfer steps described herein can effect an accumulation of NMP-carrying donor nucleic acid molecules. The accumulation of NMP-carrying donor nucleic acid molecules can result in at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, at least 99.5%, at least 99.9%, or substantially all of the plurality of the donor nucleic acid molecules present in the reaction mixture carrying an NMP.

During the NMP transfer steps, unwanted ligation products resulting from, e.g., donor/donor circularization or concatenation can be minimized or prevented by any means. Unwanted ligation can be minimized or prevented, for example, by carrying out the adenylation reaction in the presence of an amount of NTP sufficient to inhibit formation of a covalent bond (e.g., ligation) between adenylated donor nucleic acid molecules. Exemplary amounts of NTP which may inhibit ligation are described herein. Unwanted ligation can also be prevented by modification of the 3′ terminal group of the donor nucleic acid molecules. 3′ terminal groups of the donor nucleic acid molecules can be modified with a 3′ terminal blocking group by any means known in the art. Generally, the 3′ terminal blocking group will prevent the formation of a covalent bond between the 3′ terminal base and another nucleotide. In some embodiments, the 3′ terminal blocking group is dideoxy-dNTP, biotin, a 3′ amino moiety, or a “reversed” nucleoside base. In some embodiments, the ligase is a T4 RNA ligase and a donor nucleic acid molecule comprises a modified 3′ terminal group. In other embodiments, the ligase is a T4 RNA ligase and donor nucleic acid molecules comprise unmodified 3′ terminal groups. In yet other embodiments, the ligase is not a T4 RNA ligase and donor nucleic acid molecules comprise unmodified 3′ terminal groups.

In some embodiments, adenylation occurs in the reaction mixture for a time sufficient to effect accumulation of adenylated donor nucleic acid molecules. In some embodiments, the reaction mixture is incubated for about 1 minute, about 2 minutes, about 3 minutes, about 4 minutes, about 5 minutes, about 10 minutes, about 15 minutes, about 20 minutes, about 25 minutes, about 30 minutes, about 35 minutes, about 40 minutes, about 45 minutes, about 50 minutes, about 55 minutes, about 60 minutes, about 70 minutes, about 80 minutes, about 90 minutes, about 120 minutes, about 150 minutes, about 180 minutes, about 210 minutes, about 240 minutes, or more than 240 minutes. In some embodiments, the reaction mixture is incubated for 2-10 minutes, 5-20 minutes, 10-30 minutes, 20-60 minutes, 30-90 minutes, 60-150 minutes, 120-240 minutes, or more than 240 minutes.

In some embodiments the reaction mixture is incubated at a desired temperature to facilitate adenylation of donor nucleic acid molecules. In some embodiments the reaction mixture is heated to about 50° C., about 51° C., about 52° C., about 53° C., about 54° C., about 55° C., about 56° C., about 57° C., about 58° C., about 59° C., about 60° C., about 61° C., about 62° C., about 63° C., about 64° C., about 65° C., about 66° C., about 67° C., about 68° C., about 69° C., about 70° C., or above 70° C. In some embodiments the reaction mixture is heated to about 60-70° C. In other embodiments adenylation can occur at room temperature (e.g., 20-25° C.) or can occur at about 35-40° C. (e.g., 37° C.). In some embodiments the reaction mixture is incubated at 0-4° C., 4-15° C., or 10-20° C. In some embodiments the reaction mixture is incubated for about 5 minutes, about 10 minutes, about 15 minutes, about 20 minutes, about 25 minutes, about 30 minutes, about 35 minutes, about 40 minutes, about 45 minutes, about 50 minutes, about 55 minutes, about 60 minutes, about 70 minutes, about 80 minutes, about 90 minutes, about 120 minutes, about 150 minutes, about 180 minutes, about 210 minutes, about 240 minutes, or more than 240 minutes. In some embodiments, the reaction mixture is incubated for 2-10 minutes, 5-20 minutes, 10-30 minutes, 20-60 minutes, 30-90 minutes, 60-150 minutes, 120-240 minutes, or more than 240 minutes. In particular embodiments the reaction mixture is heated to 65° C. for about 60 minutes.

After accumulation of adenylated donor nucleic acid molecules, ligation of an acceptor nucleic acid molecule to an adenylated donor nucleic acid molecule can be effected without separating (e.g., purifying) the adenylated donor nucleic acid molecules from the reaction mixture. In some embodiments ligation is effected by further adding to the reaction mixture liquid in an amount sufficient to dilute NTP. In some embodiments NTP is diluted 2-fold, 3-fold, 4-fold, 5-fold, 6-fold, 7-fold, 8-fold, 9-fold, 10-fold, 12-fold, 15-fold, 20-fold, 50-fold, 100-fold, or more than 100-fold. The liquid can comprise water, buffer, monovalent ion, cation, a high molecular weight inert molecule, or any combination thereof. For example, further amounts of buffer, monovalent ion, cation, high molecular weight inert molecule, or any combination thereof, can be added to the reaction mixture in order to preserve the original concentration of these reaction mixture components upon dilution of NTP. The dilution of NTP can release NTP-mediated inhibition of the ligase, thereby allowing the ligation step to proceed. In some embodiments, ligation is effected by further adding to the reaction mixture a cation. The cation can be Mg²⁺, or can be Mn²⁺. In some embodiments the cation is Mn²⁺. In some embodiments the cation facilitates the ligation step. In some embodiments Mn²⁺ is present in the reaction mixture at a final concentration of 0 mM-2 mM, 1 mM-2.5 mM, 2.5 mM-5 mM, 5 mM-7.5 mM, or greater than 7.5 mM. In some embodiments Mn²⁺ is present in the reaction mixture at a final concentration of 2.5 mM, 3 mM, 3.5 mM, 4 mM, 4.5 mM, 5 mM, 5.5 mM, 6 mM, 6.5 mM, 7 mM, 7.5 mM, or more than 7.5 mM. In some embodiments Mn²⁺ is present in the reaction mixture at a final concentration of about 5 mM. In some embodiments Mn²⁺ is present in the reaction mixture at a final concentration of about 2.5 mM to about 7.5 mM. In some embodiments the method further comprises adding to the reaction mixture an amount of acceptor nucleic acid molecules. In some embodiments the acceptor nucleic acid molecules are added in an amount that is in excess as compared to the amount of donor nucleic acid molecules. For example, the acceptor nucleic acid molecules can be added in an amount that is 1.5×-10×, 2×-50×, 5×-100×, 50×-500×, or more than 500× the amount of donor nucleic acid molecules in the reaction mixture. In other embodiments the acceptor nucleic acid molecules are added in an amount such that the amount of donor nucleic acid molecules is in excess as compared to the amount of acceptor nucleic acid molecules. For example, the donor nucleic acid molecules can be present in an amount that is 1.5×-10×, 2×-50×, 5×-100×, 50×-500×, or more than 500× the amount of acceptor nucleic acid molecules in the reaction mixture. In some embodiments, additional amounts of ligase can be added to the reaction mixture. In some embodiments, no additional ligase is added to the reaction mixture.
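
The dilution arithmetic implied by the foregoing paragraph can be illustrated as follows. This Python sketch is offered only as an example under stated assumptions: the starting volume, NTP concentration, and Mn²⁺ stock concentration are hypothetical values, and the function names are not part of the disclosure.

```python
def diluent_volume_ul(initial_volume_ul: float, fold: float) -> float:
    """Volume of liquid (uL) to add so that every solute is diluted `fold`-fold."""
    if initial_volume_ul <= 0 or fold < 1:
        raise ValueError("volume must be > 0 and fold must be >= 1")
    return initial_volume_ul * (fold - 1)


def diluted_concentration(initial_conc: float, fold: float) -> float:
    """Concentration after a `fold`-fold dilution (same units as the input)."""
    if fold < 1:
        raise ValueError("fold must be >= 1")
    return initial_conc / fold


def stock_volume_for_final(final_conc: float, final_volume_ul: float,
                           stock_conc: float) -> float:
    """Stock volume (uL) giving `final_conc` in `final_volume_ul` (C1*V1 = C2*V2)."""
    if stock_conc <= 0 or final_conc > stock_conc:
        raise ValueError("stock must be more concentrated than the target")
    return final_conc * final_volume_ul / stock_conc


if __name__ == "__main__":
    # Hypothetical example: 10 uL adenylation reaction containing 500 uM ATP,
    # diluted 10-fold, with Mn2+ brought to 5 mM from a 100 mM stock.
    added = diluent_volume_ul(10, 10)
    final_volume = 10 + added
    print(added, diluted_concentration(500, 10))
    print(stock_volume_for_final(5, final_volume, 100))
```

In this sketch the Mn²⁺ stock volume counts toward the added liquid, so the remaining added volume would be made up of water or the other components named above.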

In some embodiments, the reaction mixture is incubated for a time sufficient to effect ligation of the NMP-carrying donor nucleic acid molecules to the acceptor nucleic acid molecules. In some embodiments, the reaction mixture is incubated for about 5 minutes, about 10 minutes, about 15 minutes, about 20 minutes, about 25 minutes, about 30 minutes, about 35 minutes, about 40 minutes, about 45 minutes, about 50 minutes, about 55 minutes, about 60 minutes, about 70 minutes, about 80 minutes, about 90 minutes, about 120 minutes, about 150 minutes, about 180 minutes, about 210 minutes, about 240 minutes, or more than 240 minutes. In some embodiments, the reaction mixture is incubated for 2-10 minutes, 5-20 minutes, 10-30 minutes, 20-60 minutes, 30-90 minutes, 60-150 minutes, 120-240 minutes, or more than 240 minutes.

In some embodiments the reaction mixture is incubated at a desired temperature to facilitate ligation. In some embodiments the reaction mixture is heated to about 50° C., about 51° C., about 52° C., about 53° C., about 54° C., about 55° C., about 56° C., about 57° C., about 58° C., about 59° C., about 60° C., about 61° C., about 62° C., about 63° C., about 64° C., about 65° C., about 66° C., about 67° C., about 68° C., about 69° C., about 70° C., or above 70° C. In some embodiments the reaction mixture is heated to about 60-70° C. In other embodiments ligation can occur at cold temperatures (e.g., about 0-4° C., about 4° C., about 4-15° C., about 12° C., or about 10-20° C.), at room temperature (e.g., 20-25° C.), or at about 35-40° C. (e.g., 37° C.). In some embodiments the reaction mixture is incubated at the desired temperature for about 5 minutes, about 10 minutes, about 15 minutes, about 20 minutes, about 25 minutes, about 30 minutes, about 35 minutes, about 40 minutes, about 45 minutes, about 50 minutes, about 55 minutes, about 60 minutes, about 70 minutes, about 80 minutes, about 90 minutes, about 120 minutes, about 150 minutes, about 180 minutes, about 210 minutes, about 240 minutes, or more than 240 minutes. In some embodiments, the reaction mixture is incubated at the desired temperature for 2-10 minutes, 5-20 minutes, 10-30 minutes, 20-60 minutes, 30-90 minutes, 60-150 minutes, 120-240 minutes, or more than 240 minutes. In particular embodiments the reaction mixture is heated to 65° C. for about 60 minutes.

Following incubation, the method can further comprise inactivating the ligase by any means known in the art. Inactivation of the ligase can be effected by heat-inactivation. For example, the reaction mixture can be heated to 65, 70, 75, 80, 85, 90, 95, or more than 95° C. for 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more than 10 minutes. In particular embodiments, the reaction mixture is heated to 80° C. for 10 minutes, followed by 95° C. for 3 minutes. Inactivation of the ligase can also be effected by, e.g., incubation with EDTA, incubation with formamide, incubation with urea, or incubation with protease.

Following inactivation of the ligase, the desired ligation products can be purified or separated from the reaction mixture by any means known in the art. For example, proteins of the reaction mixture can be removed by treating the reaction mixture with a protease. Protease treatment can involve incubating the reaction mixture with a protease for about 1, 2, 3, 4, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60 minutes, or over 60 minutes at 20-25° C., 35-40° C. (e.g., 37° C.), or more than 40° C. The protease can then be inactivated, e.g., by incubating for 10-20 minutes at 75° C. The desired reaction products can be further purified, for example, by precipitation, by column purification, by centrifugation, or any other method known in the art.

An exemplary embodiment of a method for high-efficiency ligation is depicted in FIG. 10. In a first step (optional), double-stranded DNA fragments (e.g., donor) are partially denatured and treated with T4 polynucleotide kinase. The T4 polynucleotide kinase catalyzes the addition of phosphate groups to the 5′ termini of donor nucleic acid molecules and removal of phosphate groups from the 3′ termini of donor nucleic acid molecules. The donor may or may not be purified at this point. In a next step, the donor molecules are added to a reaction mixture comprising excess ATP-dependent RNA ligase, excess ATP, and Mg²⁺. The ligase catalyzes transfer of an adenylyl monophosphate to the 5′ phosphate of the donor molecules, releasing PPi. The reaction mixture is incubated under conditions sufficient to effect an accumulation of adenylated donor nucleic acid molecules. In a next step following adenylation, liquid is added to the reaction mixture to dilute ATP at least 10-fold. In some embodiments, the adenylated donor molecules are first sedimented by centrifugation for 1, 2, 5, 10, 20, or 30 min at >1,000, >2,000, or >22,000×g, and the supernatant removed prior to dilution. The liquid may comprise further components, including but not limited to water, monovalent salts, Mg²⁺, and PEG. Also added to the reaction mixture are nucleic acid molecules to be ligated to the donor molecules (e.g., acceptor) and Mn²⁺. The acceptor nucleic acids may or may not comprise a detectable tag (e.g., biotin). The detectable tag may be used for detecting and/or affinity binding. Both the dilution of ATP and addition of Mn²⁺ drive the ligation reaction to completion, resulting in ligation products comprising acceptor-donor molecules.

Another exemplary embodiment of a method for high-efficiency ligation is depicted in FIG. 11. In a first step (optional), double-stranded DNA fragments (e.g., donor) are partially denatured and treated with an enzyme that catalyzes the addition of phosphate groups to the 3′ termini of donor nucleic acid molecules and removal of phosphate groups from the 5′ termini of donor nucleic acid molecules. The donor may or may not be purified at this point. In a next step, the donor molecules are added to a reaction mixture comprising excess GTP-dependent RNA ligase (e.g., RtcB), excess GTP, and Mn²⁺. The ligase catalyzes transfer of a guanylyl monophosphate to the 3′ phosphate of the donor molecules, releasing PPi. The reaction mixture is incubated under conditions sufficient to effect an accumulation of guanylylated donor nucleic acid molecules. In a next step following guanylylation, liquid is added to the reaction mixture to dilute GTP at least 10-fold. The liquid may comprise further components, including but not limited to water, monovalent salts, Mn²⁺, and PEG. Also added to the reaction mixture are nucleic acid molecules to be ligated to the donor molecules (e.g., acceptor) and Mn²⁺. In some embodiments, the Mn²⁺ is present in an amount that is at least 2.5 mM. In some embodiments, the Mn²⁺ is present in an amount that is about 5 mM. In some embodiments, the Mn²⁺ is present in an amount that is about 2.5 mM to about 7 mM. The acceptor nucleic acids may or may not comprise a detectable tag (e.g., biotin). The detectable tag may be used for detecting and/or affinity binding. Both the dilution of GTP and addition of Mn²⁺ drive the ligation reaction to completion, resulting in ligation products comprising acceptor-donor molecules.

Exemplary Applications

The high-efficiency ligation methods are useful for a wide range of applications. For example, the high efficiency ligation methods are useful for any applications in which tagging of nucleic acids with a detectable tag or an affinity tag is desired. As another example, the high efficiency ligation methods are useful for any applications in which linking of one nucleic acid species to another nucleic acid species is desired. The high efficiency ligation methods are also useful for the preparation of nucleic acid libraries for analysis, e.g., for analysis by sequencing or by array hybridization assays, including comparative genome hybridization (CGH) assays. Such high efficiency preparation methods confer many advantages to downstream analysis, for example, by allowing for the direct analysis of a starting sample of nucleic acids without significant loss of starting material, by allowing for direct analysis of nucleic acids without requiring pre-amplification, by allowing for analysis of nucleic acids without introducing labeling or amplification bias which can be associated with pre-amplification, and by lowering potential bioinformatic load. Such high efficiency ligation methods and kits may also be useful for, e.g., molecular cloning purposes, or for barcoding applications.

Sequencing Applications/High Efficiency Library Preparation

The high efficiency ligation methods and kits as described herein can be applied to the preparation of nucleic acid libraries for sequencing. Such preparation methods enable digital sequencing of the nucleic acids without significant loss of starting material, particularly for sequencing utilizing emulsion based sequencing platforms. Such preparation methods can also enable detection of DNA methylation without the use of bisulfite treatment. An exemplary method of DNA methylation detection is described in Flusberg et al., Nature Methods 2010 June; 7(6):461-465, which is hereby incorporated by reference. Accordingly, further aspects of the disclosure relate to methods, kits, and systems for high-efficiency nucleic acid library preparation. The nucleic acid library can be used for sequencing by a sequencing platform. The sequencing platform can be a next-generation sequencing (NGS) platform. In some embodiments, the method further comprises sequencing the nucleic acid library using NGS technology. Exemplary NGS technologies and sequencing platforms are described herein.

In one aspect, the disclosure provides methods of preparing a nucleic acid library from a plurality of template nucleic acids isolated from a biological source. The plurality of template nucleic acids can comprise genomic material. The genomic material can comprise genomic DNA (gDNA), RNA, or cDNA reverse-transcribed from RNA. The nucleic acid library can be a DNA library, an RNA library, a single-stranded DNA library, or a double-stranded DNA library. In some embodiments, the method comprises ligation of adaptor sequences to template nucleic acids. In some embodiments, the method improves efficiency of adaptor ligation by over 10-fold, 50-fold, 100-fold, 500-fold, 1000-fold, or more than 1000-fold. The methods described herein can, for example, increase adaptor ligation efficiency to over 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 97%, 98%, 99%, 99.5%, or 99.9% efficiency. In some embodiments, the methods result in correct ligation of adaptors to over 80%, over 85%, over 90%, over 95%, over 97%, over 98%, over 99%, over 99.5%, over 99.9%, or substantially all of the plurality of template nucleic acids. Such highly efficient ligation methods as described herein can enable the preparation of nucleic acid libraries that accurately represent substantially all of the desired nucleic acids (e.g., gDNA, RNA, or cDNA) isolated from the biological source. Furthermore, the methods described herein can obviate the necessity of library pre-amplification, and avoid the introduction of pre-amplification bias and sequencing errors resulting from pre-amplification. Such methods can pave the way for digital sequencing capabilities, e.g., the capability to provide a digital readout of sequence reads for each individual template nucleic acid isolated from a biological source, and can improve the sensitivity for detection of rare mutations (e.g., rare single nucleotide polymorphisms (SNPs) or rare copy number variants). Accordingly, in some aspects the disclosure provides a method of sequencing a plurality of nucleic acids isolated from a biological source, comprising ligating sequencing adaptors to at least 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 99%, or substantially all of the plurality of nucleic acids, thereby creating a nucleic acid library, and sequencing the nucleic acid library without pre-amplification of the library.
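
By way of example only, the ligation efficiency percentages discussed above can be computed from molecule counts as in the following Python sketch. The counts in the example are hypothetical, and how ligated and total molecules are measured is outside the scope of the sketch.

```python
# Percentage thresholds recited in the sequencing method described above.
THRESHOLDS_PERCENT = (10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99)


def ligation_efficiency(ligated: int, total: int) -> float:
    """Percentage of template molecules that carry a ligated adaptor."""
    if total <= 0 or not 0 <= ligated <= total:
        raise ValueError("require 0 <= ligated <= total and total > 0")
    return 100.0 * ligated / total


def thresholds_met(efficiency_percent: float) -> list:
    """Return every recited threshold satisfied by the given efficiency."""
    return [t for t in THRESHOLDS_PERCENT if efficiency_percent >= t]


if __name__ == "__main__":
    # Hypothetical counts: 950 of 1000 template molecules adaptor-ligated.
    eff = ligation_efficiency(950, 1000)
    print(eff, thresholds_met(eff))
```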

In some embodiments, the method comprises ligating an adaptor sequence to a first end of at least 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, or 90% of a plurality of template nucleic acids, thereby creating a nucleic acid library. An adaptor sequence can comprise a defined oligonucleotide sequence that effects coupling of a library member to a sequencing platform. By way of example only, the adaptor can comprise a sequence that is at least 70% complementary or identical to an oligonucleotide sequence immobilized onto a solid support (e.g., a sequencing flow cell or bead). An adaptor sequence can comprise a defined oligonucleotide sequence that is at least 70% complementary or identical to a sequencing primer. The sequencing primer can enable nucleotide incorporation by a polymerase, wherein incorporation of the nucleotide is monitored to provide sequencing information. In some embodiments, an adaptor comprises a sequence that is at least 70% complementary or identical to an oligonucleotide sequence immobilized onto a solid support and a sequence that is at least 70% complementary or identical to a sequencing primer. In some embodiments, the adaptor can comprise a barcode sequence. In some embodiments, at least 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, or 100% of sequencing library members in a library comprise the same adaptor sequence. In some embodiments, at least 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, or 100% of sequencing library members comprise an adaptor sequence at a first end but not at a second end. In some embodiments, the first end is a 5′ end. In some embodiments, the first end is a 3′ end. In some embodiments, at least 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, or 100% of sequencing library members comprise an adaptor sequence at a first and at a second end. The adaptor sequence at the first end may be distinct from the adaptor sequence at the second end. The adaptor sequence can be chosen by a user according to the sequencing platform used for sequencing. In some embodiments, the method of ligating an adaptor to a first end of a nucleic acid comprises a high efficiency ligation method as described herein.
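
One simple way to evaluate the "at least 70% complementary or identical" criterion described above is a position-by-position comparison of equal-length sequences, sketched below in Python. This is an illustrative assumption rather than a definition from the disclosure: it ignores gaps and alignment, and the example sequences are hypothetical.

```python
_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq.upper()))


def percent_identity(a: str, b: str) -> float:
    """Ungapped percent identity between two equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


def percent_complementary(a: str, b: str) -> float:
    """Ungapped percent complementarity of `a` to `b` (antiparallel pairing)."""
    return percent_identity(a, reverse_complement(b))


def meets_criterion(adaptor: str, oligo: str, minimum: float = 70.0) -> bool:
    """True if the adaptor is at least `minimum` % identical or complementary."""
    return (percent_identity(adaptor, oligo) >= minimum
            or percent_complementary(adaptor, oligo) >= minimum)


if __name__ == "__main__":
    # Hypothetical 10-nt sequences differing at the last two positions.
    print(percent_identity("ACGTACGTAC", "ACGTACGTTT"))
```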

In some embodiments, following ligation of a first adaptor at a first end of a template nucleic acid, ligation of a second adaptor at a second end of the template nucleic acid is performed using any of the methods as described herein. By way of example only, an Illumina sequencing by synthesis platform comprises a solid support with a first and second population of surface-bound oligonucleotides immobilized thereon. Such oligonucleotides comprise a sequence for hybridizing to a first and second Illumina-specific adaptor oligonucleotide and priming an extension reaction. Accordingly, in some embodiments the library member comprises a first Illumina-specific adaptor that is partially or wholly complementary to a first population of surface-bound oligonucleotides of an Illumina system. The library member may further comprise a second Illumina-specific adaptor that is partially or wholly complementary to a second population of surface-bound oligonucleotides of an Illumina system. By way of other example only, the SOLiD, Ion Torrent, and GS FLX systems each comprise a solid support in the form of a bead with surface-bound oligonucleotides immobilized thereon. Accordingly, in some embodiments the nucleic acid library member comprises an adaptor sequence that is complementary to a surface-bound oligonucleotide of a SOLiD system, Ion Torrent system, or GS FLX system.

The plurality of template nucleic acids can comprise a template nucleic acid that is over 120 nt long. The plurality of template nucleic acids can have an average length of >120 nt. The plurality of template nucleic acids can have an average length of 50-100, 75-125, 120-150, 130-170, 150-250, 200-500, 300-700, 500-1000, 800-2000, 1500-5000, 4000-10000, or over 10000 nt. The plurality of template nucleic acids can comprise genomic DNA. The plurality of template nucleic acids can comprise single-stranded (ss) nucleic acid fragments, such as, e.g., ssDNA. In some embodiments, the method can result in ligation of an adaptor sequence to a first end of at least 95%, 96%, 97%, 98%, 99%, 99.5%, or greater than 99.5% of the plurality of template nucleic acids.

FIG. 12 depicts an exemplary workflow for preparing a nucleic acid library. In a first step 1210, nucleic acids are obtained from a biological source. The biological source can be a subject. Exemplary biological sources and subjects are described herein. In a second step 1220, adaptors are ligated to 90% of the obtained nucleic acids using any of the methods described herein. In a third step 1230 (optional), the library may be sequenced, or may be adaptor-ligated to a second adaptor using any of the methods as described herein, or undergo target-selective library preparation. Target-selective library preparation may be by any means known in the art. Exemplary target-selective library preparation methods are described in, e.g., U.S. Pat. Nos. 6,063,604; 6,090,591; 8,349,563; US Patent Application Pub. Nos. 2009010508, 20110244455, 2012003657, 20120157322, 20130045872, and PCT Publication No. WO2012103154, all of which are hereby incorporated by reference. In some embodiments, the library is subjected to a method for preparing a target-enriched nucleic acid library as described herein.

FIG. 13A depicts an exemplary embodiment of a method for preparing a nucleic acid library, comprising ligating a first adaptor to a 5′ end of nucleic acid fragments. In a first step 1310 a plurality of template nucleic acid fragments (e.g., DNA fragments) comprising a 5′ phosphate is incubated in a reaction mixture containing an excess amount of ligase and excess ATP. The template DNA fragments may be fully or partially denatured. The ligase catalyzes transfer of AMP to the 5′ phosphate of the template nucleic acid fragments (e.g., adenylates the template DNA fragments), releasing PPi in the process. The reaction is incubated under conditions sufficient to result in adenylation of at least 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, or 90% of the template nucleic acid fragments. In a next step 1320, liquid is added to the reaction mixture in an amount sufficient to dilute ATP at least 10-fold. The liquid may comprise components such as, e.g., water, monovalent salts, Mg²⁺, and PEG. Also added to the reaction mixture are the adaptor oligonucleotides to be ligated to the donor molecules (e.g., Adaptor 1) and Mn²⁺. The adaptor oligonucleotides may or may not comprise a detectable tag. The detectable tag may be used for detecting and/or affinity binding. The adaptor oligonucleotides may comprise 3′ OH groups. Both the dilution of ATP and addition of Mn²⁺ may drive the ligation reaction to completion, resulting in ligation products comprising, in the 5′-3′ direction, Adaptor 1-template nucleic acid. The ligation products may then be collected and optionally further processed in step 1330 by sequencing, by ligation of a second adaptor sequence to a 3′ end (as described in, e.g., FIG. 14A), followed by sequencing, or by target-selective library preparation as described herein. In some embodiments, the library is subjected to a method for preparing a target-enriched nucleic acid library as described herein.

FIG. 13B depicts another exemplary embodiment of a method for preparing a nucleic acid library, comprising ligating a first adaptor to a 3′ end of nucleic acid fragments. In a first step 1350 a plurality of oligonucleotide adaptors (e.g., Adaptor 1) comprising a 5′ phosphate is incubated in a reaction mixture containing an excess amount of ligase and excess ATP. The Adaptor 1 oligonucleotides may be fully or partially denatured. The Adaptor 1 oligonucleotides may or may not comprise a detectable tag. The detectable tag may be used for detecting and/or affinity binding. The ligase catalyzes transfer of AMP to the 5′ phosphate of the Adaptor 1 oligonucleotides (e.g., adenylates Adaptor 1), releasing PPi in the process. The reaction is incubated under conditions sufficient to result in adenylation of at least 90% of Adaptor 1. In a next step 1360, liquid is added to the reaction mixture in an amount sufficient to dilute ATP at least 10-fold. The liquid may comprise components such as, e.g., water, monovalent salts, Mg²⁺, and PEG. Also added to the reaction mixture are the sample of template nucleic acids (e.g., template) and Mn²⁺. The template nucleic acids may comprise 3′ OH groups. Both the dilution of ATP and addition of Mn²⁺ may drive the ligation reaction to completion, resulting in ligation products comprising, in the 5′-3′ direction, template nucleic acid-Adaptor 1. The ligation products may then be collected and optionally further processed in step 1370 by sequencing, by ligation of a second adaptor sequence to a 5′ end as described in FIG. 14B, followed by sequencing, or by target-selective library preparation as described herein. In some embodiments, the library is subjected to a method for preparing a target-enriched nucleic acid library as described herein.

FIG. 14A depicts an exemplary embodiment of a method for ligating a second adaptor sequence to Adaptor 1-template nucleic acid molecules prepared as described in FIG. 13A. In a first step 1410, a plurality of oligonucleotides comprising a second adaptor sequence (“Adaptor 2”) comprising a 5′ phosphate is incubated in a reaction mixture containing an excess amount of ligase and excess ATP. The oligonucleotides may be fully or partially denatured. The ligase catalyzes transfer of AMP to the 5′ phosphate of the oligonucleotides (e.g., adenylates the Adaptor 2 oligonucleotides), releasing PPi in the process. The reaction is incubated under conditions sufficient to result in adenylation of at least 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, or 90% of the Adaptor 2 oligonucleotides. In a next step 1420, liquid is added to the reaction mixture in an amount sufficient to dilute ATP at least 10-fold. The liquid may comprise components such as, e.g., water, monovalent salts, Mg²⁺, and PEG. Also added to the reaction mixture are the Adaptor 1-template nucleic acid molecules (e.g., as described in FIG. 13A) and Mn²⁺. The Adaptor 1-template nucleic acid molecules may comprise 3′ OH groups. Both the dilution of ATP and addition of Mn²⁺ drive the ligation reaction to completion, resulting in ligation products comprising Adaptor 1-template nucleic acid-Adaptor 2 library members. The ligation products may optionally be sequenced.

FIG. 14B depicts an exemplary embodiment of a method for ligating a second adaptor sequence to template nucleic acid-Adaptor 1 molecules prepared as described in FIG. 13B. In a first step 1450, the template-Adaptor 1 molecules comprising a 5′ phosphate are incubated in a reaction mixture containing an excess amount of ligase and excess ATP. The template-Adaptor 1 molecules may be fully or partially denatured. The ligase catalyzes transfer of AMP to the 5′ phosphate of the template-Adaptor 1 molecules (e.g., adenylates the template-Adaptor 1 molecules), releasing PPi in the process. The reaction is incubated under conditions sufficient to result in adenylation of at least 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, or 90% of the template-Adaptor 1 molecules. In a next step 1460, liquid is added to the reaction mixture in an amount sufficient to dilute ATP at least 10-fold. The liquid may comprise components such as, e.g., water, monovalent salts, Mg²⁺, PEG. Also added to the reaction mixture are Adaptor 2 oligonucleotides comprising a second adaptor sequence and Mn²⁺. The Adaptor 2 oligonucleotides may comprise 3′ OH groups. Both the dilution of ATP and addition of Mn²⁺ drive the ligation reaction to completion, resulting in ligation products comprising Adaptor 2-template-Adaptor 1 library members. The library members may also be constructed as Adaptor1-template-Adaptor 2 using the methods as described herein. The ligation products may optionally be sequenced.

Target-Enriched Library Preparation

In another aspect, the disclosure provides a method for preparing a target-enriched DNA library. The method can involve hybridizing a target-selective oligonucleotide to a sequencing library member to create a hybridization product. The method can further comprise amplifying the hybridization product in a single round of amplification to create an extension strand.

The method of target enrichment can be as described in U.S. Patent Application Pub. No. 20120157322, hereby incorporated by reference.

The hybridizing and amplifying can occur in a reaction mixture. The mixture may comprise nucleotides (dNTPs), a polymerase and a target-selective oligonucleotide. In some embodiments, the mixture comprises a plurality of target-selective oligonucleotides. The mixture can comprise, for example, 1-10, 5-20, 10-50, 40-100, 80-200, 150-500, 300-1000, 800-2000, 1000-5000, 4000-10000, 8000-20000, or more than 20000 target-selective oligonucleotides. The mixture may further comprise a Tris buffer, a monovalent salt, and Mg²⁺. The concentration of each component can be optimized by an ordinarily skilled artisan. The reaction mixture can also comprise additives including, but not limited to, non-specific background/blocking nucleic acids (e.g., salmon sperm DNA), biopreservatives (e.g., sodium azide), PCR enhancers (e.g., betaine, trehalose, etc.), and inhibitors (e.g., RNase inhibitors). In some embodiments, a nucleic acid sample (e.g., a sample comprising a library member) is admixed with the reaction mixture.

The library member can be fully or partially denatured. The library member can comprise a first single-stranded adaptor sequence located at a first end but not at a second end. In some embodiments, the first end is a 5′ end. In some embodiments, the library member comprising a first adaptor sequence at a 5′ end is prepared as described in FIG. 13A. In other embodiments, the library member comprising a first adaptor sequence is prepared by ligating a reverse complement adaptor sequence to a 3′ end of a nucleic acid (e.g., a gDNA fragment) as described in FIG. 13B, followed by linear amplification of the resulting ligation product using a primer comprising a full adaptor sequence and hybridizable to the reverse complement. In some embodiments, the target-selective oligonucleotide comprises a second single-stranded adaptor sequence located at a first end but not at a second end. The first end of the target-selective oligonucleotide can be a 5′ end. In some embodiments, the first adaptor sequence comprises a sequence that is at least 70% identical to a first surface-bound oligonucleotide. In some embodiments, the first adaptor sequence comprises a sequence that is at least 70% identical to a sequencing primer. In some embodiments, the first adaptor further comprises a barcode sequence. In some embodiments, the second adaptor comprises a sequence that is at least 70% identical to a second surface-bound oligonucleotide. In some embodiments, the second adaptor comprises a sequence that is at least 70% identical to a sequencing primer.

The target-selective oligonucleotide can be designed to at least partially hybridize to a target polynucleotide of interest. In some embodiments, the target-selective oligonucleotide is designed to selectively hybridize to the target polynucleotide. The target-selective oligonucleotide can be at least about 70%, 75%, 80%, 85%, 90%, 95%, or more than 95% complementary to a sequence in the target polynucleotide. In some embodiments, the target-selective oligonucleotide is 100% complementary to a sequence in the target polynucleotide. The hybridization can result in a target-selective oligonucleotide/target duplex with a Tm. The Tm of the target-selective oligonucleotide/target duplex can be between 0-100° C., between 20-90° C., between 40-80° C., between 50-70° C., or between 55-65° C. The target-selective oligonucleotide can be sufficiently long to prime the synthesis of extension products in the presence of a polymerase. The exact length and composition of a target-selective oligonucleotide can depend on many factors, including temperature of the annealing reaction, source and composition of the primer, and ratio of primer:probe concentration. The target-selective oligonucleotide can be, for example, 8-50, 10-40, or 12-24 nucleotides in length.
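The design criteria above (percent complementarity, duplex Tm, and length) can be screened with a few lines of code. The following is a minimal sketch only. The function names and the example target site are hypothetical, and the Tm estimate uses the Wallace rule of thumb (2 °C per A/T, 4 °C per G/C), which is a rough approximation for short oligonucleotides and is not a method stated in this disclosure.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def percent_complementarity(tso: str, target_site: str) -> float:
    """Percent of TSO positions that base-pair with an equal-length target site.

    Both sequences are written 5'->3'; no gaps are considered.
    """
    if len(tso) != len(target_site):
        raise ValueError("TSO and target site must have the same length")
    perfect = reverse_complement(target_site)
    matches = sum(a == b for a, b in zip(tso.upper(), perfect))
    return 100.0 * matches / len(tso)


def wallace_tm(seq: str) -> float:
    """Rough Tm estimate in deg C (Wallace rule); an assumption, not the source's method."""
    s = seq.upper()
    return 2.0 * (s.count("A") + s.count("T")) + 4.0 * (s.count("G") + s.count("C"))


def within_length(seq: str, low: int = 12, high: int = 24) -> bool:
    """True if the oligonucleotide length falls in the inclusive range [low, high]."""
    return low <= len(seq) <= high


if __name__ == "__main__":
    site = "ATGCGTACCTGAAGCTTGCA"    # hypothetical 20-nt target site
    tso = reverse_complement(site)   # a fully complementary candidate
    print(percent_complementarity(tso, site), wallace_tm(tso), within_length(tso))
```

A candidate could then be kept only if its complementarity and estimated Tm fall inside whichever of the ranges listed above is chosen for the assay.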

The method can comprise extension of the target in the reaction mixture. The extension can be primed by a target-selective oligonucleotide in a target-selective oligonucleotide/target duplex. In some embodiments, extension is carried out utilizing a nucleic acid polymerase. The nucleic acid polymerase can be a DNA polymerase. In particular embodiments, the DNA polymerase is a thermostable DNA polymerase. The polymerase can be a member of B family DNA proofreading polymerases (Vent, Pfu, Phusion, and their variants), a DNA polymerase holoenzyme (DNA pol III holoenzyme), a Taq polymerase, or a combination thereof.

Extension can be carried out as an automated process wherein the reaction mixture comprising template DNA is cycled through a denaturing step, an annealing step, and a synthesis step. The automated process may be carried out using a PCR thermal cycler. Commercially available thermal cycler systems include systems from Bio-Rad Laboratories, Life Technologies, and Perkin-Elmer, among others. In some embodiments, one cycle of amplification is performed.

Extension of the target-selective oligonucleotide/target duplex can result in a double-stranded extension product comprising (1) the original ssDNA fragment comprising the target sequence, and (2) an extended strand comprising the second adaptor sequence, the target-selective oligonucleotide, a reverse complement of the target sequence, and a reverse complement of the first adaptor sequence. If the first adaptor sequence of the original ssDNA fragment was 70% or more identical to a first surface-bound oligonucleotide, then the extended strand would comprise a first adaptor sequence that is 70% or more complementary to the first surface-bound oligonucleotide, and thereby would be hybridizable to the first surface-bound oligonucleotide. The extended strands can comprise the target-enriched library, wherein each library member comprises a first adaptor at a first end and a second adaptor at a second end.
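The composition of the extended strand described above can be illustrated with plain string operations. This is a minimal sketch under stated assumptions: all sequences are short toy sequences invented for illustration, the function name `extend_tso` is hypothetical, annealing is modeled as an exact match, and extension is assumed to run to the 5′ end of the template.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def extend_tso(library_member: str, second_adaptor: str, tso_target_part: str) -> str:
    """Return the extended strand (5'->3') made by extending a hybridized TSO.

    library_member: 5'-[first adaptor][fragment containing the target]-3'
    The TSO is 5'-[second adaptor][target-selective part]-3'.
    """
    binding_site = reverse_complement(tso_target_part)
    position = library_member.find(binding_site)
    if position < 0:
        raise ValueError("TSO does not anneal to this library member")
    # The polymerase copies the template from the binding site to the template's
    # 5' end, so the new strand ends with the reverse complement of the first adaptor.
    template_to_copy = library_member[:position]
    return second_adaptor + tso_target_part + reverse_complement(template_to_copy)


if __name__ == "__main__":
    first_adaptor = "AATGATACGG"         # toy first adaptor
    fragment = "CCGTTAGCATCGGATCCAAGT"   # toy fragment containing the target
    second_adaptor = "CAAGCAGAAG"        # toy second adaptor
    tso_part = reverse_complement(fragment[8:20])
    extended = extend_tso(first_adaptor + fragment, second_adaptor, tso_part)
    print(extended)
    print(extended.endswith(reverse_complement(first_adaptor)))
```

The final check mirrors the statement above that the extended strand carries the second adaptor at one end and a sequence complementary to the first adaptor at the other.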

The target-enriched library can be sequenced. The target-enriched library members can be denatured. The denatured library members can be contacted with a surface having immobilized thereon at least a first surface-bound oligonucleotide. In some embodiments, the extended strand is captured by the first surface-bound oligonucleotide, which can anneal to the first adaptor sequence on the extended strand.

The first surface-bound oligonucleotide can prime the extension of the captured extended strand. In some embodiments, extension of the captured extended strand results in a captured extension product. The captured extension product can comprise the first surface-bound oligonucleotide, the target sequence, and a second adaptor sequence that is at least 70% complementary to a second surface-bound oligonucleotide.

In some embodiments, the captured extension product hybridizes to the second surface-bound oligonucleotide, forming a bridge. In some embodiments, the bridge is amplified by bridge PCR. Bridge PCR methods can be carried out using methods known in the art. A person skilled in the art will appreciate that the methods described herein can be adapted to any solid-phase amplification method, such as amplification on a bead.

Variations in Methodologies for Library Preparation, e.g., from Genomic DNA

Further embodiments of the disclosure relate to variations in methodologies for preparing nucleic acid libraries for sequencing (e.g., NGS), which can, e.g., improve target enrichment. In some embodiments of a library preparation method, genomic DNA (gDNA) is fragmented to a plurality of fragments of a desired range of lengths for a desired sequencing platform, damaged bases, nucleotides and/or abasic sites are removed or optionally replaced, and ends are optionally polished, as described herein. Phosphate groups can be removed from the dsDNA fragments, e.g., as described herein. In some embodiments, the method further comprises ligating a first adaptor sequence (e.g., an NGS adaptor that optionally contains a sample-identifying barcode [index] sequence) to the 3′-end of at least about 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 99%, or 100% of the DNA fragments that have been denatured partially or wholly to create a plurality of ssDNA fragments, e.g., as described herein. The first adaptor sequence at the 3′-end optionally can contain a moiety capable of binding to an immobilized capturing reagent, or can be attached to a solid support (e.g., beads, e.g., magnetic beads, or a flow cell). For example, the first adaptor sequence at the 3′-end can be attached to biotin so that biotinylated fragments can be captured by a solid support (e.g., beads, e.g., magnetic beads, resin or column) containing streptavidin or avidin. The 5′-end of the DNA fragments (at the double-stranded or single-stranded stage) can optionally be capped (e.g., as described in further detail below). Any DNA fragments not ligated at the 3′-end to an adaptor can optionally be removed by capturing biotinylated fragments with a streptavidin/avidin solid support and washing away unligated fragments, or by washing away unligated fragments if the first adaptor at the 3′-end of the DNA fragments is directly attached to a solid support. 
An extension primer containing a sequence that is complementary to at least a portion of the first adaptor sequence on the 3′-end of the fragments can be added to the ssDNA fragments. The extension primer can be extended. If the first adaptor sequence at the 3′-end contains a moiety (e.g., biotin) that is bound to an immobilized capturing reagent (e.g., a streptavidin/avidin solid support) or is directly attached to a solid support, the reactants from the extension reaction can be washed away. The double-stranded products of the extension reaction can be denatured, and a plurality of single-stranded extension products comprising at the 5′-end a sequence complementary to at least a portion of the first adaptor sequence can be collected (e.g., by removal from a solid support). In some embodiments, the method further comprises: (i) hybridizing a target-selective oligonucleotide (TSO) to at least one member of the plurality of single-stranded extension products, wherein the TSO comprises a sequence complementary to at least a portion of a target DNA sequence of interest and a second adaptor sequence at the 5′-end of the TSO, wherein the second adaptor sequence is different from the first adaptor sequence and optionally contains a strand-identifying barcode (index) sequence; and (ii) extending the hybridized TSO, and optionally performing linear amplification for an appropriate number of cycles (e.g., about 40 cycles), e.g., as described herein, to produce amplification products comprising the second adaptor sequence, a sequence identical to at least a portion of the target DNA sequence, and a sequence identical to at least a portion of the first adaptor sequence. In certain embodiments, the TSO comprises a sequence having at least about 50%, 60%, 70%, 80%, 90% or 95% identity or complementarity to a region of a gene, e.g., a cancer-related gene. 
A plurality of TSOs targeting the same DNA sequence of interest, or a plurality of TSOs targeting a plurality of different DNA sequences of interest, can be used.

In some embodiments of another library preparation method, genomic DNA is fragmented to a plurality of fragments of a desired range of lengths for a desired sequencing platform, damaged nucleotides, bases, and/or abasic sites are removed or replaced, and ends are optionally polished, as described herein. All phosphate groups are removed from the dsDNA fragments, and the dsDNA fragments are denatured into ssDNA fragments, as described herein. In some embodiments, the dsDNA fragments are not denatured into ssDNA fragments prior to library formation.

In some embodiments, a method (see, e.g., FIG. 60) comprises ligating a first oligonucleotide comprising a first adaptor sequence (e.g., a sequence complementary at least partially to a NGS adaptor sequence that optionally contains a sample-identifying barcode) to the 3′-end of at least, or about 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 30%, 50%, 70%, 90% or 95% of the plurality of nucleic acid fragments (e.g., RNA fragments, ssDNA fragments, dsDNA fragments) to generate a plurality of modified nucleic acid fragments (e.g., RNA fragments, ssDNA fragments, dsDNA fragments) (6050). The plurality of nucleic acid fragments (e.g., RNA fragments, ssDNA fragments, dsDNA fragments) can be a whole genome or transcriptome; the plurality of nucleic acid fragments (e.g., RNA fragments, ssDNA fragments, dsDNA fragments) can be from a single cell or from a single organism. The first oligonucleotide can comprise RNA and/or DNA. The first oligonucleotide can be single-stranded, double-stranded, or partially double-stranded. The first oligonucleotide can be, e.g., a single-stranded RNA or DNA adaptor. The nucleic acid fragments (e.g., RNA fragments, ssDNA fragments, dsDNA fragments) can be modified as described herein, e.g., modified at the 5′ end. The adaptor can be an indexed Illumina P7 adaptor. The first oligonucleotide can be of a length of about 10 nts to about 150 nts, a length of about 15 nts to about 80 nts, a length of about 19 to about 25 nts, or a length of about 19 nts. The first oligonucleotide can optionally contain a moiety (e.g., biotin) capable of binding to an immobilized capturing reagent (e.g., a streptavidin/avidin solid support (e.g., beads, resin or column)), or can be attached to a solid support (e.g., beads (e.g., magnetic beads) or a flow cell). The 5′-end of the DNA fragments can optionally be capped (described in further detail herein). 
The ligation can comprise transferring an NMP (e.g., AMP) to a 5′ end of the first oligonucleotide, diluting the reaction mixture to dilute the ATP in the reaction mixture, adding a cation (e.g., Mn²⁺), and ligating a 5′ end of the first oligonucleotide to a 3′ end of a template nucleic acid.

Any nucleic acid fragments not ligated at the 3′-end to the first oligonucleotide can optionally be removed by capturing, e.g., biotinylated fragments onto a streptavidin/avidin solid support and washing away unligated fragments.

In some embodiments, the method comprises: (a) ligating a first single-stranded adaptor to a 3′ end of a single-stranded nucleic acid template to generate a single-stranded template ligated to a first single-stranded adaptor, (b) annealing a primer to the single-stranded adaptor ligated to the single-stranded nucleic acid template, (c) performing linear amplification using the primer to generate a linear amplification product comprising a primer and sequence complementary to the single-stranded nucleic acid template, and (d) ligating a second single-stranded adaptor to a 3′ end of the linear amplification product. The linear amplification can be performed under isothermal conditions. The linear amplification can be performed under cycling temperature conditions. The linear amplification can be performed with a polymerase, e.g., a Bst DNA polymerase, a thermostable polymerase. The method can further comprise pre-adenylating the single-stranded nucleic acid template or first single-stranded adaptor prior to the ligating in step (a). The single-stranded nucleic acid template and/or the single-stranded adaptors can be phosphorylated prior to the ligating. In some embodiments, the method comprises phosphorylating a 5′ end of the first single-stranded adaptor and/or a 5′ end of the second single-stranded adaptor. In some embodiments, the method comprises phosphorylating a 5′ end of the single-stranded nucleic acid template. Unligated first single-stranded adaptor can be removed after step (a); unligated single-stranded nucleic acid fragment can be removed after step (d). The amplification can involve polymerase chain reaction (PCR). The amplification can be performed at a low level PCR cycle. In some cases, the amplification is performed using about 1 to about 15 cycles of PCR. In some cases, the amplification is performed using about 2 to about 15 cycles of PCR. In some cases, the amplification is performed using about 5 to about 12 cycles. 
In some cases, the amplification is performed using about 10 to about 15 cycles. In some cases, the amplification is performed using 1 cycle of PCR. In some cases, the amplification is performed using 2 cycles of PCR. In some cases, the amplification is performed using 10 cycles of PCR. In some cases, the amplification is performed using 11 cycles of PCR. In some cases, the amplification is performed using 12 cycles of PCR. In some cases, the amplification is performed using 13 cycles of PCR. In some cases, the amplification is performed using 14 cycles of PCR. In some cases, the linear amplification product of step (d) is sequenced using sequencing techniques and platforms described herein or other techniques and platforms in the field.

In some embodiments, the method comprises ligating a first single-stranded adaptor to a 3′ end of a single-stranded template nucleic acid fragment followed by linear amplification, wherein an annealed primer is extended to generate an extension product with sequence complementary to the single-stranded template nucleic acid fragment and the first single-stranded adaptor. The primer can be a target-specific oligonucleotide. The primer can be a universal primer. The primer can comprise a sequence complementary to the first single-stranded adaptor. The first single-stranded adaptor can be phosphorylated at the 5′ end. The single-stranded template nucleic acid fragment and the first single-stranded adaptor ligation product can be purified by removing unligated first single-stranded adaptor by, for example, washing, sedimenting and decanting, or centrifuging. The linear amplification can generate a double-stranded DNA fragment, which can be denatured to generate a single-stranded DNA fragment comprising the single-stranded template nucleic acid and the first single-stranded adaptor, and a single-stranded DNA fragment comprising sequence complementary to the single-stranded template nucleic acid and the first single-stranded adaptor. The purified single-stranded template nucleic acid fragment and the first single-stranded adaptor ligation product can be sequenced using techniques and platforms described herein or other techniques and platforms in the field. The purified single-stranded template nucleic acid fragment and the first single-stranded adaptor ligation product can be used for generating a target-selective library preparation. For example, the primer can comprise a target-specific oligonucleotide that anneals to a specific region of the single-stranded template nucleic acid. The target-specific oligonucleotide can be a TSO. 
The method can further comprise ligating the purified single-stranded template nucleic acid fragment and the first single-stranded adaptor ligation product to a second single-stranded adaptor having a phosphorylation on a 5′ end, thereby generating a single-stranded DNA fragment comprising the single-stranded template, the first single-stranded adaptor on one end and the second single-stranded adaptor on the other end. In some cases, the single-stranded template nucleic acid fragment and the first single-stranded adaptor and the second single-stranded adaptor ligation product is amplified using PCR prior to sequencing, using techniques and platforms described herein or standard techniques and platforms in the field.

In some embodiments, the method further comprises: hybridizing a first primer complementary to the first oligonucleotide sequence at the 3′-end of at least about 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 30%, 50%, 70%, 90%, 95%, 99%, or 100% of the plurality of modified nucleic acid (e.g., RNA or DNA) fragments and extending the hybridized first primer (6060). In some embodiments, the nucleic acid fragments are single-stranded RNA fragments or ssDNA fragments. As a non-limiting example, linear amplification can be performed for a number of cycles (e.g., about, or at least 1, 5, 10, 100, 1000, or 10,000 cycles). Linear amplification can yield nucleic acid, e.g., DNA fragments comprising at their 3′ end a region complementary to the nucleic acid fragments (e.g., RNA fragment or ssDNA fragment) and at their 5′ end a region complementary to the first adaptor. Linear amplification can be performed by a DNA polymerase. In some cases, extension is performed with a reverse transcriptase. In particular embodiments, the DNA polymerase is a thermostable polymerase. The thermostable polymerase may originate from a thermophilic bacterium or from Archaea. Exemplary thermostable polymerases include, but are not limited to, Taq polymerase from Thermus aquaticus, Pfu polymerase from Pyrococcus furiosus, Vent® DNA Polymerase from Thermococcus litoralis, Deep Vent™ polymerase from Pyrococcus sp., Platinum® Pfx polymerase, Tfi polymerase from Thermus filiformis, Pwo polymerase, chimeric DNA polymerases comprising a DNA binding protein (e.g., Phusion, iProof), topoisomerase. In some embodiments, the polymerase is capable of isothermal amplification. The polymerase can be, e.g., Bst DNA polymerase, Bca DNA polymerase, E. coli DNA polymerase I, the Klenow fragment of E. coli DNA polymerase I, Taq DNA polymerase, T7 DNA polymerase (Sequenase).
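To put the cycle numbers above in perspective, the following sketch contrasts the idealized yield of single-primer linear amplification with that of two-primer exponential PCR. The model is an assumption introduced for illustration and is not taken from the disclosure: in linear amplification only the original templates are copied in each cycle, so product grows in proportion to the cycle number, whereas in PCR each product can itself serve as a template. The function names and the `efficiency` parameter are hypothetical.

```python
def linear_products(templates: float, cycles: int, efficiency: float = 1.0) -> float:
    """Idealized number of new strands made by single-primer linear amplification.

    Each cycle copies only the original templates, so the yield grows linearly.
    """
    if cycles < 0 or not 0.0 <= efficiency <= 1.0:
        raise ValueError("cycles must be >= 0 and efficiency must be in [0, 1]")
    return templates * cycles * efficiency


def exponential_molecules(templates: float, cycles: int, efficiency: float = 1.0) -> float:
    """Idealized number of molecules after two-primer PCR, where products are templates."""
    if cycles < 0 or not 0.0 <= efficiency <= 1.0:
        raise ValueError("cycles must be >= 0 and efficiency must be in [0, 1]")
    return templates * (1.0 + efficiency) ** cycles


if __name__ == "__main__":
    start = 1000.0  # hypothetical number of adaptor-ligated template strands
    for n in (1, 5, 10, 40):
        print(n, linear_products(start, n), exponential_molecules(start, n))
```

Under this simplified model, many more cycles of linear amplification are needed to reach a given yield than cycles of PCR, which is consistent with the larger cycle numbers mentioned for linear amplification.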

The linearly amplified strand can be purified, e.g., by a method described herein.

In some embodiments, the method comprises ligating a second oligonucleotide comprising a sequence, e.g., a sequence complementary at least partially to a NGS adaptor sequence, e.g., a second adaptor as further described herein (e.g., an NGS adaptor that optionally contains a sample-identifying barcode) to the 3′-end of at least, or about 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 20%, 30%, 40%, 50%, 70%, 90%, 95%, 99%, or 100% of the plurality of extension products, e.g., linear amplification products, to generate a plurality of modified linear amplification products, as described herein (6070). The second oligonucleotide can be of a length of about 10 nts to about 150 nts, a length of about 15 nts to about 80 nts, a length of about 18 to about 25 nts, or a length of about 19 nts. The second oligonucleotide can optionally contain a moiety (e.g., biotin) capable of binding to an immobilized capturing reagent (e.g., a streptavidin/avidin solid support (e.g., beads, resin or column)), or can be attached to a solid support (e.g., beads (e.g., magnetic beads) or a flow cell). The linear amplification product comprising an adaptor sequence on each end can be purified. The linear amplification product comprising an adaptor sequence on each end can be sequenced.

In some embodiments, the method further comprises: (i) ligating a first adaptor to the 3′-end of at least about 10%, 30%, 50%, 70%, 90% or 95% of the plurality of single-stranded nucleic acid (e.g., ssRNA or ssDNA) fragments, annealing a first primer to the adaptor, and performing linear amplification for an appropriate number of cycles to yield extension products comprising a region complementary to a target DNA sequence of interest and a complement of the first adaptor sequence; (ii) hybridizing a target-selective oligonucleotide (TSO) to at least one member of the plurality of extension products, wherein the TSO anneals to the complement of the target sequence and comprises a second adaptor sequence at the 5′-end of the TSO, wherein the second adaptor sequence is different from the first adaptor sequence and optionally contains a strand-identifying barcode; and (iii) extending the hybridized TSO and performing linear amplification for an appropriate number of cycles (e.g., about 40 cycles) as described herein to produce amplification products comprising a sequence identical to at least a portion of the target nucleic acid sequence, a sequence identical to at least a portion of the first adaptor sequence, and a sequence identical to at least a portion of the second adaptor sequence. In certain embodiments, the TSO comprises a sequence having at least about 50%, 60%, 70%, 80%, 90% or 95% identity or complementarity to a region of a cancer-related gene. A plurality of TSOs targeting the same nucleic acid (e.g., RNA or DNA) sequence of interest, or a plurality of TSOs targeting a plurality of different nucleic acid (e.g., RNA or DNA) sequences of interest, can be used. Linear amplification can be performed in solution, or on a solid surface (e.g., biotinylated fragments captured on a streptavidin/avidin solid support, or direct attachment of the first adaptor at the 3′-end of the DNA fragments to a solid support), which can facilitate isolation of the amplification products.

In some embodiments, the method comprises ligating a first single-stranded adaptor to a 5′ end of a single-stranded template nucleic acid fragment followed by ligating a second single-stranded adaptor to a 3′ end of the single-stranded template nucleic acid fragment, wherein both the single-stranded template nucleic acid and the second single-stranded adaptor are phosphorylated at the 5′ end (see FIG. 61). The ligation can generate a ligation product comprising a single-stranded template nucleic acid fragment comprising the first single-stranded adaptor on the 5′ end and the second single-stranded adaptor on the 3′ end. A primer, e.g., a target-specific oligonucleotide, e.g., a TSO, can be annealed to the product. A primer, e.g., a universal primer, can be annealed to the product. The primer can comprise a sequence complementary to the second single-stranded adaptor. The ligation product can be extended, e.g., by one round of extension, or by linear amplification, wherein a primer annealed to the second single-stranded adaptor is extended to generate an extension product. The extension can comprise use of a reverse transcriptase, e.g., when the single-stranded nucleic acid template comprises RNA. The method can further comprise amplification (e.g., PCR expansion) of the extension product, e.g., using a primer that anneals to the complement of the first single-stranded adaptor and a primer that anneals to the second single-stranded adaptor. The single-stranded nucleic acid fragment can be RNA (e.g., mRNA) or DNA (e.g., cDNA, genomic DNA). The method can be used for whole-genome sequencing or whole transcriptome sequencing. The first and/or second single-stranded adaptor can comprise DNA and/or RNA.

In some embodiments of yet another library preparation method, DNA fragments (e.g., ssDNA fragments, dsDNA fragments) are generated from genomic DNA and a first adaptor sequence (e.g., an NGS adaptor that optionally contains a sample-identifying barcode) is ligated to the 5′-phosphorylated end of at least about 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90% or 95% of the plurality of DNA fragments (e.g., ssDNA fragments or dsDNA fragments). The fragments are adenylated prior to ligation, as described herein. In some embodiments, the DNA fragments are not adenylated prior to ligation. The first adaptor at the 5′-end optionally can contain a moiety (e.g., biotin) capable of binding to an immobilized capturing reagent (e.g., a streptavidin/avidin solid support [e.g., beads, resin or column]), or can be attached to a solid support (e.g., beads [including magnetic beads] or a flow cell). The 3′-end of the DNA fragments can optionally be capped (described in further detail below). Any DNA fragments not ligated at the 5′-end to an adaptor can optionally be removed, e.g., by sedimentation, or by capturing biotinylated fragments onto a streptavidin/avidin solid support and washing away unligated fragments, or by washing away unligated fragments if the first adaptor at the 5′-end of the DNA fragments is directly attached to a solid support. 
In some embodiments, the method further comprises: (i) hybridizing a target-selective oligonucleotide (TSO) to at least one member of the plurality of 5′-adaptor-ligated DNA fragments, wherein the TSO comprises a sequence complementary to at least a portion of a target DNA sequence of interest and a second adaptor sequence at the 5′-end of the TSO, wherein the second adaptor sequence is different from the first adaptor sequence and optionally contains a strand-identifying barcode; and (ii) extending the hybridized TSO and performing linear amplification for an appropriate number of cycles (e.g., about 40 cycles) as described herein to produce amplification products comprising the second adaptor sequence, a sequence complementary to at least a portion of the target DNA sequence, and a sequence complementary to at least a portion of the first adaptor sequence. In certain embodiments, the TSO comprises a sequence having at least about 10%, 30%, 50%, 70%, 90% or 95% identity or complementarity to a region of a cancer-related gene. A plurality of TSOs targeting the same DNA sequence of interest, or a plurality of TSOs targeting a plurality of different DNA sequences of interest, can be used. Linear amplification can be performed in solution, or on a solid surface (e.g., biotinylated fragments captured on a streptavidin/avidin solid support, or direct attachment of the first adaptor at the 5′-end of the DNA fragments to a solid support), which can facilitate isolation of the amplification products.

In some embodiments of still another library preparation method, DNA fragments (e.g., ssDNA fragments, dsDNA fragments) are generated from genomic DNA as described herein. A first adaptor sequence (e.g., an NGS adaptor that optionally contains a sample-identifying barcode) is ligated to the 5′-phosphorylated end of at least about 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90% or 95% of the plurality of DNA fragments (e.g., ssDNA fragments, dsDNA fragments). The DNA fragments are optionally adenylated prior to ligation as described herein. In some embodiments, the DNA fragments are not adenylated prior to ligation. In some embodiments, the method further comprises capping the 3′-end of at least about 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90% or 95% of the plurality of DNA fragments (e.g., ssDNA fragments, dsDNA fragments) by any suitable method known in the art. For example, the 3′-end of the fragments can be capped by incorporating one, two or more phosphoramidate, phosphoromonothioate and/or phosphorodithioate groups at the 3′-end (described in WO 1990/015065, which is incorporated herein by reference in its entirety), which can increase the resistance of the fragments to degradation by exonucleases. Alternatively, the 3′-end of the fragments can be capped by addition of, e.g., a dideoxynucleotide using a terminal transferase, an aminoalkyl-modified base or a biotin moiety to the 3′-end, so that there is no 3′-OH group that can participate in a ligation reaction. 
In some embodiments, the methodfurther comprises: (i) hybridizing a target-selective oligonucleotide(TSO) to at least one member of the plurality of 5′-adaptor-ligated,3′-capped DNA fragments, wherein the TSO comprises a sequencecomplementary to at least a portion of a target DNA sequence of interestand a second adaptor sequence at the 5′-end of the TSO, wherein thesecond adaptor sequence is different from the first adaptor sequence andoptionally contains a strand-identifying barcode; and (ii) extending thehybridized TSO and performing linear amplification for an appropriatenumber of cycles (e.g., about 40-100 cycles) as described herein toproduce amplification products comprising the second adaptor sequence, asequence complementary to at least a portion of the target DNA sequence,and a sequence complementary to at least a portion of the first adaptorsequence. In certain embodiments, the TSO comprises a sequence having atleast about 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90% or 95% identityor complementarity to a region of a cancer-related gene. A plurality ofTSOs targeting the same DNA sequence of interest, or a plurality of TSOstargeting a plurality of different DNA sequences of interest, can beused.
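
To show why linear amplification is run for many more cycles (e.g., about 40-100) than PCR, the following sketch compares idealized product yields. The function names and the per-cycle `efficiency` parameter are assumptions introduced for illustration; real yields depend on enzyme, template, and reaction conditions.

```python
# Idealized yield model; not a description of measured results.

def linear_copies(templates: float, cycles: int, efficiency: float = 1.0) -> float:
    """Linear amplification: only the original templates are copied each cycle."""
    return templates * cycles * efficiency


def pcr_copies(templates: float, cycles: int, efficiency: float = 1.0) -> float:
    """Exponential PCR: products of each cycle serve as templates in the next."""
    return templates * (1.0 + efficiency) ** cycles


print(linear_copies(100, 40))   # 40 cycles of linear amplification
print(pcr_copies(100, 10))      # 10 cycles of PCR
```

Under these idealized assumptions, 100 templates give 4,000 copies after 40 linear cycles, whereas 10 PCR cycles give 102,400 molecules.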

In some embodiments of yet another library preparation method, DNA fragments (e.g., ssDNA fragments, dsDNA fragments) are generated from genomic DNA, and a first adaptor sequence or pdo (e.g., an NGS adaptor that optionally contains a sample-identifying barcode) is ligated to the 5′-phosphorylated end of at least about 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90% or 95% of the plurality of DNA fragments (e.g., ssDNA fragments, dsDNA fragments). The DNA fragments are optionally adenylated prior to ligation, and the 3′-end of the fragments is optionally capped. In some embodiments, the DNA fragments are not adenylated prior to ligation and the 3′-end of the fragments is not capped. The pdo can contain a moiety (e.g., biotin) capable of binding to an immobilized capturing reagent, or the 5′-end of the pdo can be attached to a solid support. In some embodiments, the method further comprises: (i) hybridizing a target-selective oligonucleotide (TSO) to at least one member of the plurality of 5′-adaptor-ligated DNA fragments, wherein the TSO comprises a sequence complementary to at least a portion of a target DNA sequence of interest and a second adaptor sequence at the 5′-end of the TSO, wherein the second adaptor sequence is different from the first adaptor sequence and optionally contains a strand-identifying barcode; and (ii) performing one cycle of extension of the hybridized TSO to produce an extension product comprising the second adaptor sequence, a sequence complementary to at least a portion of the target DNA sequence, and a sequence complementary to at least a portion of the first adaptor sequence. The TSO can contain a moiety (e.g., biotin) capable of binding to an immobilized capturing reagent, or the 5′-end of the TSO can be attached to a solid support. In certain embodiments, the TSO comprises a sequence having at least about 50%, 60%, 70%, 80%, 90% or 95% identity or complementarity to a region of a cancer-related gene. 
A plurality ofTSOs targeting the same DNA sequence of interest, or a plurality of TSOstargeting a plurality of different DNA sequences of interest, can beused. The extension product (or a plurality of extension products if aplurality of TSOs targeting the same or different DNA sequence(s) ofinterest are used) is optionally isolated after denaturing. In someembodiments, the method further comprises performing PCR (optionally ata lower level, such as about 10 to about 15 cycles, about 1 to about 15cycles, about 2 to about 10 cycles, about 3 to about 8 cycles, or about1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15 cycles) on thesingle-stranded extension product(s) using primers complementary to atleast a portion of the first adaptor and second adaptor sequences of theextension product(s) as forward and reverse primers for PCR. PCR can beconducted in the presence of PEG (e.g., about 2-5% or 5-10% PEG), whichcan improve the efficiency of PCR. The elongation step of PCR can beperformed at a lower temperature (e.g., about 60-65° C.), and PCR can beconducted in the presence of PEG (e.g., about 5-10% PEG) to improve theefficiency of PCR at the lower elongation temperature. If the firstadaptor region at the 5′-end of the initial extension product(s) priorto PCR is attached to biotin, as an example, the ligated pdo-ssDNAproduct(s) can be captured by a streptavidin/avidin solid support andthe reactants and unligated ssDNA fragments from the initial extensioncan be washed away prior to PCR to give cleaner PCR results. 
Similarly,if the second adaptor region at the 5′-end of the initial extensionproduct(s) prior to PCR is attached to biotin, as an example, thebiotinylated extension product(s) can be captured by astreptavidin/avidin solid support and the reactants from the initialextension can be washed away prior to PCR to give cleaner PCR results.PCR then can be conducted in solution after removal of the biotinylatedextension product(s) from the solid support, or can be conducted on thesolid support. Alternatively, if the 5′-end of the TSO(s) is attached toa solid support, the reactants from the initial extension prior to PCRcan be washed away from the solid support, and PCR can then be conductedon solid support. FIGS. 45A and 45B depict a solution-phase embodimentof this library preparation method, and FIGS. 46A and 46B depict asolid-phase embodiment of this method.
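
The percent identity or complementarity recited for the TSO can be illustrated with a minimal sketch. It assumes an ungapped, equal-length comparison, which is a simplification; practical tools typically compute identity from a sequence alignment. The function names and sequences are hypothetical.

```python
# Simplified, ungapped comparison of equal-length sequences (an assumption).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def percent_identity(a: str, b: str) -> float:
    """Percentage of positions at which two equal-length sequences match."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def percent_complementarity(oligo: str, region: str) -> float:
    """Identity of the oligo to the reverse complement of the target region."""
    return percent_identity(oligo, reverse_complement(region))


print(percent_identity("ACGTACGTAC", "ACGTACGTTT"))      # 8 of 10 positions match
print(percent_complementarity("CGTA", "TACG"))           # fully complementary
```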

In some embodiments, a single-stranded adaptor can be ligated to a 5′ or 3′ end of a single-stranded nucleic acid fragment, e.g., RNA or DNA, e.g., genomic DNA. The single-stranded adaptor can comprise an affinity tag or a reactive moiety (see, e.g., FIG. 46A). The affinity tag or reactive moiety can be biotinyl-TEG, aminohexyl, or acrydite. The single-stranded nucleic acid fragment can be a single-stranded DNA fragment. The single-stranded nucleic acid fragment can be a single-stranded DNA fragment generated from a double-stranded nucleic acid fragment, for example, by denaturing the double-stranded nucleic acid fragment. The double-stranded nucleic acid can be genomic DNA. The single-stranded adaptor can be coupled to a solid support. The single-stranded adaptor can be coupled to a solid support prior to ligation to the single-stranded nucleic acid fragment. In some cases, the single-stranded adaptor is coupled to a solid support after ligation to the single-stranded nucleic acid fragment. In some cases, the single-stranded nucleic acid fragment is pre-adenylated using techniques and reagents described herein. The solid support can comprise a paramagnetic material, for example, a streptavidin polystyrene bead, a (streptavidin) polyacrylamide bead, a tosyl-activated carboxylated bead, or an NHS-activated carboxylated bead. Unligated single-stranded nucleic acid fragment can be removed from ligated single-stranded nucleic acid fragment prior to subsequent procedures, e.g., annealing a target-specific oligonucleotide to the single-stranded nucleic acid fragment, extending the oligonucleotide, and amplifying. A target-specific oligonucleotide probe (e.g., a TSO) can be annealed to the single-stranded nucleic acid fragment that is coupled to the solid support. The target-specific oligonucleotide probe (e.g., a TSO) can comprise a 3′ end that anneals to the target sequence and a 5′ end comprising a second adaptor sequence. The second adaptor sequence can be complementary to the single-stranded DNA fragment on the 3′ end. 
In somecases, the target-specific oligonucleotide probe (e.g., a TSO) annealsto the single-stranded nucleic acid fragment on the 3′ end and extendsto generate a full length complementary fragment of the single-strandednucleic acid fragment comprising the target-specific oligonucleotideprobe (e.g., a TSO) on one end and the single-stranded adaptor on theother end. In some cases, the target-specific oligonucleotide probe(e.g., a TSO) anneals to a region of the single-stranded nucleic acidfragment and extends to generate a sub-fragment of the single-strandednucleic acid fragment comprising the target-specific oligonucleotideprobe (e.g., a TSO) on one end and the single-stranded adaptor on theother end. The single-stranded nucleic acid fragment or sub-fragmentcomprising the target-specific oligonucleotide probe (e.g., a TSO) canbe amplified using a first primer comprising sequence of the firstsingle-stranded adaptor and a second primer comprising sequence of thesecond adaptor. The amplification can be linear amplification. Theamplification can involve polymerase chain reaction (PCR). Theamplification can be performed at a low level PCR cycle. In some cases,the amplification is performed using about 1 to about 15 cycles of PCR.In some cases, the amplification is performed using about 2 to about 15cycles of PCR. In some cases, the amplification is performed using about5 to about 12 cycles. In some cases, the amplification is performedusing about 10 to about 15 cycles. In some cases, the amplification isperformed using 1 cycle of PCR. In some cases, the amplification isperformed using 2 cycles of PCR. In some cases, the amplification isperformed using 10 cycles of PCR. In some cases, the amplification isperformed using 11 cycles of PCR. In some cases, the amplification isperformed using 12 cycles of PCR. In some cases, the amplification isperformed using 13 cycles of PCR. In some cases, the amplification isperformed using 14 cycles of PCR. 
When the adaptor is ligated to the 3′end, a primer can be annealed to the 3′ adaptor sequence and extended,e.g., using a polymerase or reverse transcriptase. Linear amplificationor polymerase chain reaction can be performed.

As described above, solid-based technology can be advantageouslyemployed. An adaptor region at the 3′-end or the 5′-end of an ssDNAfragment can contain a moiety (e.g., biotin) capable of binding to animmobilized capturing reagent, such as a streptavidin/avidin solidsupport (e.g., beads [including magnetic beads], resin or column) forbinding to biotin, or the adaptor region can be attached to a solidsupport (e.g., beads [including magnetic beads] or a flow cell).Alternatively, an adaptor region at the 5′-end of a TSO can contain amoiety (e.g., biotin) capable of binding to an immobilized capturingreagent, or can be attached to a solid support (e.g., beads [includingmagnetic beads] or a flow cell). Solid-based methodologies can be used,e.g., to remove DNA fragments that are not ligated to an adaptor priorto hybridization with a TSO, to remove the reactants of the initialextension reaction of a hybridized TSO prior to any PCR being performed,and/or to facilitate the isolation and purification of amplificationproducts (whether amplification [e.g., linear amplification or PCR] isconducted in solution or on a solid surface), which can minimize thegeneration of artifacts and give cleaner results.

In some embodiments of yet another library preparation method, nucleic acid fragments (e.g., ssDNA fragments, genomic DNA fragments) are generated from genomic DNA. The method comprises (a) ligating a first single-stranded adaptor to a 5′ end of a single-stranded nucleic acid fragment, (b) ligating a second single-stranded adaptor to a 3′ end of the single-stranded nucleic acid fragment, thereby generating a single-stranded nucleic acid fragment comprising a 5′ first single-stranded adaptor and a 3′ second single-stranded adaptor following step (a) and step (b), and (c) sequencing the single-stranded nucleic acid fragment comprising a 5′ first single-stranded adaptor and a 3′ second single-stranded adaptor. Ligation of the first single-stranded adaptor and the second single-stranded adaptor can occur sequentially in any order. In one example, ligation of the first single-stranded adaptor occurs prior to ligation of the second single-stranded adaptor, and the ligation occurs in a reaction mixture that lacks the second single-stranded adaptor. In another example, ligation of the second single-stranded adaptor occurs prior to ligation of the first single-stranded adaptor, and the ligation occurs in a reaction mixture that lacks the first single-stranded adaptor. Ligation of the first single-stranded adaptor and the second single-stranded adaptor can occur simultaneously. In one example, ligation of the first single-stranded adaptor occurs simultaneously with ligation of the second single-stranded adaptor, and the ligation occurs in a reaction mixture that comprises both the first single-stranded adaptor and the second single-stranded adaptor. The method may further comprise phosphorylating a 5′ end of the single-stranded nucleic acid fragment before step (a) and/or step (b). The first single-stranded adaptor may be pre-adenylated before step (a). 
The second single-stranded adaptor may be pre-adenylated before step (b). Unligated single-stranded nucleic acid fragment after step (a) can be purified from the ligated single-stranded nucleic acid fragment prior to the sequencing step (c). Similarly, unligated single-stranded nucleic acid fragment after step (b) can be purified from the ligated single-stranded nucleic acid fragment prior to the sequencing step (c). The method may further comprise amplifying the single-stranded nucleic acid fragment comprising a 5′ first single-stranded adaptor and a 3′ second single-stranded adaptor before the sequencing step (c). The amplification may involve polymerase chain reaction (PCR). The amplification can be performed with a low number of PCR cycles. In some cases, the amplification is performed using about 1 to about 15 cycles of PCR. In some cases, the amplification is performed using about 2 to about 15 cycles of PCR. In some cases, the amplification is performed using about 5 to about 12 cycles. In some cases, the amplification is performed using about 10 to about 15 cycles. In some cases, the amplification is performed using 1 cycle of PCR. In some cases, the amplification is performed using 2 cycles of PCR. In some cases, the amplification is performed using 10 cycles of PCR. In some cases, the amplification is performed using 11 cycles of PCR. In some cases, the amplification is performed using 12 cycles of PCR. In some cases, the amplification is performed using 13 cycles of PCR. In some cases, the amplification is performed using 14 cycles of PCR.
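
As a minimal sketch of the dual-adaptor construct described above, the following models ligation as string concatenation and shows that sequential ligation in either order yields the same product. The helper names `ligate` and `extract_insert` and all sequences are hypothetical; ligation efficiency and chemistry are not modeled.

```python
# Toy model: a ligated fragment is the 5' adaptor + insert + 3' adaptor.
from typing import Optional


def ligate(fragment: str, five_prime: str = "", three_prime: str = "") -> str:
    """Attach an adaptor to the 5' end and/or the 3' end of a fragment."""
    return five_prime + fragment + three_prime


def extract_insert(read: str, five_prime: str, three_prime: str) -> Optional[str]:
    """Return the insert if the read carries both adaptors, else None."""
    long_enough = len(read) >= len(five_prime) + len(three_prime)
    if long_enough and read.startswith(five_prime) and read.endswith(three_prime):
        return read[len(five_prime): len(read) - len(three_prime)]
    return None


a5, a3, insert = "AACCGG", "TTGGCC", "ACGTACGTAGCT"   # hypothetical sequences

# Sequential ligation in either order, or simultaneous ligation.
five_then_three = ligate(ligate(insert, five_prime=a5), three_prime=a3)
three_then_five = ligate(ligate(insert, three_prime=a3), five_prime=a5)
simultaneous = ligate(insert, a5, a3)

print(five_then_three == three_then_five == simultaneous)
print(extract_insert(simultaneous, a5, a3))
```

A fragment ligated at only one end fails the check in `extract_insert`, which mirrors the purpose of removing unligated fragments before sequencing.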

cDNA Library Preparation from RNA

All the disclosure herein relating to library preparation from genomicDNA can be modified for and applied to the preparation of cDNA librariesfrom RNA. As a non-limiting example of a cDNA library preparationmethod, RNA fragments are generated from total RNA or a certain type ofRNA (e.g., mRNA) as described herein. A first adaptor sequence (e.g., anNGS adaptor that optionally contains a sample-identifying barcode) isligated to the 5′-phosphorylated end of at least about 10%, 20%, 30%,40%, 50%, 60%, 70%, 80%, 90% or 95% of the plurality of RNA fragments(the fragments are optionally adenylated prior to ligation), similar toadaptor ligation to DNA fragments. The first adaptor at the 5′-end ofthe RNA fragments optionally can contain a moiety (e.g., biotin) capableof binding to an immobilized capturing reagent (e.g., astreptavidin/avidin solid support [e.g., beads, resin or column]), orcan be attached to a solid support (e.g., beads [including magneticbeads] or a flow cell). The 3′-end of the RNA fragments can optionallybe capped, similar to end-capping of DNA fragments. Any RNA fragmentsnot ligated at the 5′-end to an adaptor can optionally be removed bycapturing, e.g., biotinylated fragments with a streptavidin/avidin solidsupport and washing away unligated fragments, or by washing awayunligated fragments if the first adaptor at the 5′-end of the RNAfragments is directly attached to a solid support. 
In some embodiments,the method further comprises: (i) hybridizing a target-selectiveoligonucleotide (TSO) to at least one member of the plurality of5′-adaptor-ligated RNA fragments, wherein the TSO comprises a sequencecomplementary to at least a portion of a target RNA sequence of interestand a second adaptor sequence at the 5′-end of the TSO, wherein thesecond adaptor sequence is different from the first adaptor sequence andoptionally contains a strand-identifying barcode; and (ii) extending thehybridized TSO and performing amplification for an appropriate number ofcycles (e.g., about 40-100 cycles) to produce amplification productscomprising the second adaptor sequence, a sequence complementary to atleast a portion of the target RNA sequence, and a sequence complementaryto at least a portion of the first adaptor sequence. In certainembodiments, the TSO comprises a sequence having at least about 50%,60%, 70%, 80%, 90% or 95% identity or complementarity to a region of acancer-related gene or mRNA. A plurality of TSOs targeting the same RNAsequence of interest, or a plurality of TSOs targeting a plurality ofdifferent RNA sequences of interest, can be used. Amplification can beperformed using a reverse transcriptase for reverse transcription of RNAsequences and a DNA polymerase for replication of DNA sequences (e.g.,the first adaptor region ligated to the 5′-end of the RNA fragments), orusing an enzyme having both reverse transcriptase activity and DNApolymerase activity, such as Tth DNA polymerase. Amplification can beperformed in solution, or on a solid surface (e.g., biotinylatedfragments captured on a streptavidin/avidin solid support, or directattachment of the first adaptor at the 5′-end of the RNA fragments to asolid support), which can facilitate isolation of the cDNA amplificationproducts.
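
The reverse transcription step can be sketched at the sequence level as follows. This is an illustrative model only: the function name `reverse_transcribe` and the example sequence are hypothetical, and enzyme behavior (processivity, fidelity, template switching) is not represented.

```python
# Sequence-level model of first-strand cDNA synthesis from an RNA template.
RNA_TO_CDNA = str.maketrans("ACGU", "TGCA")


def reverse_transcribe(rna: str) -> str:
    """Return the cDNA (5'->3') complementary to an RNA written 5'->3'."""
    return rna.translate(RNA_TO_CDNA)[::-1]


rna_fragment = "AUGGCUUAC"      # hypothetical RNA fragment
cdna = reverse_transcribe(rna_fragment)
print(cdna)
```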

As another example of a cDNA library preparation method, RNA fragmentsare generated from total RNA or a certain type of RNA (e.g., mRNA), afirst adaptor sequence (e.g., an NGS adaptor that optionally contains asample-identifying barcode) is ligated to the 5′-phosphorylated end ofat least about 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90% or 95% of theRNA fragments (the fragments are optionally adenylated prior toligation), and the 3′-end of the RNA fragments is optionally capped. Insome embodiments, the method further comprises: (i) hybridizing atarget-selective oligonucleotide (TSO) to at least one member of theplurality of 5′-adaptor-ligated RNA fragments, wherein the TSO comprisesa sequence complementary to at least a portion of a target RNA sequenceof interest and a second adaptor sequence at the 5′-end of the TSO,wherein the second adaptor sequence is different from the first adaptorsequence and optionally contains a strand-identifying barcode; and (ii)performing one cycle of extension of the hybridized TSO to produce acDNA extension product comprising the second adaptor sequence, asequence complementary to at least a portion of the target RNA sequence,and a sequence complementary to at least a portion of the first adaptorsequence. The extension reaction is performed using a reversetranscriptase for reverse transcription of RNA sequences and a DNApolymerase for replication of DNA sequences (e.g., the first adaptorregion ligated to the 5′-end of the RNA fragments), or using an enzymethat has both reverse transcriptase activity and DNA polymeraseactivity, such as Tth DNA polymerase. The TSO can contain a moiety(e.g., biotin) capable of binding to an immobilized capturing reagent,or the 5′-end of the TSO can be attached to a solid support. In certainembodiments, the TSO comprises a sequence having at least about 50%,60%, 70%, 80%, 90% or 95% identity or complementarity to a region of acancer-related gene or mRNA. 
A plurality of TSOs targeting the same RNAsequence of interest, or a plurality of TSOs targeting a plurality ofdifferent RNA sequences of interest, can be used. The cDNA extensionproduct (or a plurality of cDNA extension products if a plurality ofTSOs targeting the same or different RNA sequence(s) of interest areused) is optionally isolated after denaturing. In some embodiments, themethod further comprises performing PCR (optionally at a lower level,such as about 10 to about 15 cycles) on the single-stranded cDNAextension product(s) using primers complementary to at least a portionof the first adaptor and second adaptor sequences of the cDNA extensionproduct(s) as forward and reverse primers for PCR. PCR can be conductedin the presence of PEG (e.g., about 2-5% or 5-10% PEG). The elongationstep of PCR can be performed at a lower temperature (e.g., about 60-65°C.), and PCR can be conducted in the presence of PEG (e.g., about 5-10%PEG) to improve the efficiency of PCR at the lower elongationtemperature. If the second adaptor region at the 5′-end of the initialcDNA extension product(s) prior to PCR is attached to biotin, as anexample, the biotinylated cDNA extension product(s) can be captured by astreptavidin/avidin solid support and the reactants from the initialextension can be washed away prior to PCR to give cleaner PCR results.PCR then can be conducted in solution after removal of the biotinylatedcDNA extension product(s) from the solid support, or can be conducted onthe solid support. Alternatively, if the 5′-end of the TSO(s) isattached to a solid support, the reactants from the initial extensionprior to PCR can be washed away from the solid support, and PCR can thenbe conducted on solid support. FIGS. 45A and 45B depict a solution-phaseembodiment of this cDNA library preparation method, and FIGS. 46A and46B depict a solid-phase embodiment of this method.

As another example of a cDNA library preparation method, cDNA fragments are generated from total RNA or a certain type of RNA (e.g., mRNA) using primers, e.g., by random-primed reverse transcription, where the primers, e.g., random primers, are phosphorylated at the 5′ end (see, e.g., FIG. 62). The total RNA or the certain type of RNA (e.g., mRNA) can be cell-free nucleic acid from a biological sample. The total RNA or the certain type of RNA (e.g., mRNA) may be fragmented. The total RNA or the certain type of RNA (e.g., mRNA) can comprise a junction between two genes resulting from a gene fusion. The gene fusion may be associated with a cancer. The random primer may have a hexamer sequence. A first adaptor sequence (e.g., an NGS adaptor that optionally contains a sample-identifying barcode), e.g., a single-stranded first adaptor sequence, can be ligated to the 5′-phosphorylated end of at least, or about, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, or 100% of the cDNA fragments (the fragments can be optionally adenylated prior to ligation), and the 3′-end of the cDNA fragments can be optionally capped. The 5′-phosphorylated end can be adenylated. In some embodiments, the method further comprises: (i) hybridizing a target-selective oligonucleotide (TSO) to at least one member of the plurality of 5′-adaptor-ligated cDNA fragments, wherein the TSO comprises a sequence complementary to at least a portion of a target cDNA sequence of interest and a second adaptor sequence at the 5′-end of the TSO, wherein the second adaptor sequence is different from the first adaptor sequence and optionally contains a strand-identifying barcode; and (ii) performing one cycle of extension of the hybridized TSO to produce an extension product comprising the second adaptor sequence, a sequence complementary to at least a portion of the target cDNA sequence, and a sequence complementary to at least a portion of the first adaptor sequence. The target sequence can comprise a gene sequence. 
The extension reaction can be performed using a DNA polymerase for replication of DNA sequences (e.g., the first adaptor region ligated to the 5′-end of the cDNA fragments) as described herein. The TSO can contain a moiety (e.g., biotin) capable of binding to an immobilized capturing reagent, or the 5′-end of the TSO can be attached to a solid support. In certain embodiments, the TSO comprises a sequence having at least about 50%, 60%, 70%, 80%, 90% or 95% identity or complementarity to a region of a cancer-related gene or mRNA. A plurality of TSOs targeting the same sequence (e.g., cDNA sequence) of interest, or a plurality of TSOs targeting a plurality of different sequences of interest (e.g., cDNA sequences of interest), can be used. The second strand extension product (e.g., second strand cDNA) (or a plurality of cDNA extension products if a plurality of TSOs targeting the same or different RNA sequence(s) of interest are used) is optionally isolated after denaturing. In some embodiments, the method further comprises performing PCR (optionally at a lower level, such as about 10 to about 15 cycles, or about 2 to about 15 cycles) on the single-stranded second extension product(s) (e.g., second strand cDNA) using a primer with at least a portion of the first adaptor sequence and a primer with at least a portion of the second adaptor sequence. PCR can be conducted in the presence of PEG (e.g., about 2-5% or 5-10% PEG). The elongation step of PCR can be performed at a lower temperature (e.g., about 60 to about 65° C.), and PCR can be conducted in the presence of PEG (e.g., about 5-10% PEG) to improve the efficiency of PCR at the lower elongation temperature. The PCR can occur in solution. 
Ifthe second adaptor region at the 5′-end of the second extensionproduct(s) (e.g., second cDNA strand) prior to PCR is attached tobiotin, as an example, the biotinylated second extension product(s)(e.g., second cDNA strand) can be captured by a streptavidin/avidinsolid support and the reactants from the initial extension can be washedaway prior to PCR, e.g., to give cleaner PCR results. PCR then can beconducted in solution after removal of the biotinylated extensionproduct(s) (e.g., cDNA) from the solid support, or can be conducted onthe solid support. If the 5′-end of the TSO(s) is attached to a solidsupport, the reactants from the initial extension prior to PCR can bewashed away from the solid support, and PCR can then be conducted onsolid support. FIGS. 45A and 45B depict a solution-phase embodiment ofthis cDNA library preparation method, and FIGS. 46A and 46B depict asolid-phase embodiment of this method. The products of the amplifyingcan be used to detect a gene fusion event, e.g., a gene fusion eventassociated with cancer.
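
How amplification products might reveal a gene fusion can be sketched with a simple junction check on a sequenced read. This is an illustrative assumption, not the method of the disclosure: it uses exact substring matching against two hypothetical gene sequences, whereas practical analysis would typically rely on alignment to a reference. The names `find_fusion_junction` and `min_anchor` are invented for this sketch.

```python
# Toy junction finder: a fusion-spanning read has a left part found in one
# gene and a right part found in the other gene.
from typing import Optional


def find_fusion_junction(read: str, gene_a: str, gene_b: str,
                         min_anchor: int = 4) -> Optional[int]:
    """Return the read position where gene_a sequence joins gene_b sequence.

    Requires at least min_anchor bases on each side of the junction.
    Returns None if no such split point exists.
    """
    for split in range(min_anchor, len(read) - min_anchor + 1):
        left, right = read[:split], read[split:]
        if left in gene_a and right in gene_b:
            return split
    return None


gene_a = "ATGGCCAAGT"            # hypothetical 5' fusion partner
gene_b = "CCTGAGGTTA"            # hypothetical 3' fusion partner
fusion_read = "CAAGT" + "TGAGG"  # read spanning a hypothetical junction

print(find_fusion_junction(fusion_read, gene_a, gene_b))
print(find_fusion_junction("GGCCAAGT", gene_a, gene_b))
```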

As another example of a cDNA library preparation method, cDNA fragments are generated from total RNA or a certain type of RNA (e.g., mRNA) using random-primed reverse transcription, wherein the total RNA or the certain type of RNA is phosphorylated at the 5′ end. The total RNA or the certain type of RNA (e.g., mRNA) is cell-free nucleic acid from a biological sample. The total RNA or the certain type of RNA (e.g., mRNA) may be fragmented. The total RNA or the certain type of RNA (e.g., mRNA) comprises a junction between two genes resulting from a gene fusion. The gene fusion may be associated with a cancer. The random primer may have a hexamer sequence. A first adaptor sequence (e.g., an NGS adaptor that optionally contains a sample-identifying barcode), e.g., a single-stranded first adaptor sequence, is ligated to the 5′-phosphorylated end of at least, or about, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, or 100% of the cDNA fragments (the fragments can be optionally adenylated prior to ligation) to generate a single-stranded nucleic acid fragment comprising a 5′ adaptor. The 3′-end of the cDNA fragments can be optionally capped. The unligated first single-stranded adaptor or unhybridized TSO can be removed from the single-stranded nucleic acid fragment comprising the 5′ adaptor by washing, sedimenting, decanting, and centrifuging. 
In some embodiments, the method further comprises: (i) hybridizing a target-selective oligonucleotide (TSO) probe to at least one member of the plurality of 5′-adaptor-ligated cDNA fragments to create a hybridization product, wherein the TSO probe comprises a sequence complementary to at least a portion of a target cDNA sequence of interest, or a 3′ end that anneals to the target sequence, and a 5′ end that comprises a second adaptor sequence, wherein the second adaptor sequence is different from the first adaptor sequence and optionally contains a strand-identifying barcode; and (ii) performing one cycle of extension of the hybridized TSO to produce a cDNA extension product comprising the second adaptor sequence, a sequence complementary to at least a portion of the target cDNA sequence, and a sequence complementary to at least a portion of the first adaptor sequence. The target sequence may comprise a gene sequence. The extension reaction can be performed using a DNA polymerase for replication of DNA sequences (e.g., the first adaptor region ligated to the 5′-end of the cDNA fragments) as described herein. The TSO can contain a moiety (e.g., biotin) capable of binding to an immobilized capturing reagent, or the 5′-end of the TSO can be attached to a solid support. In certain embodiments, the TSO comprises a sequence having at least about 50%, 60%, 70%, 80%, 90% or 95% identity or complementarity to a region of a cancer-related gene or mRNA. A plurality of TSOs targeting the same cDNA sequence of interest, or a plurality of TSOs targeting a plurality of different cDNA sequences of interest, can be used. The cDNA extension product (or a plurality of cDNA extension products if a plurality of TSOs targeting the same or different RNA sequence(s) of interest are used) is optionally isolated after denaturing. 
In some embodiments, the method further comprises performing PCR (optionally at a lower level, such as about 10-15 cycles) on the single-stranded cDNA extension product(s) using a primer with at least a portion of the first adaptor sequence and a primer with at least a portion of the second adaptor sequence. PCR can be conducted in the presence of PEG (e.g., about 2-5% or 5-10% PEG). The elongation step of PCR can be performed at a lower temperature (e.g., about 60-65° C.), and PCR can be conducted in the presence of PEG (e.g., about 5-10% PEG) to improve the efficiency of PCR at the lower elongation temperature. The PCR can occur in solution. If the second adaptor region at the 5′-end of the initial cDNA extension product(s) prior to PCR is attached to biotin, as an example, the biotinylated cDNA extension product(s) can be captured by a streptavidin/avidin solid support and the reactants from the initial extension can be washed away prior to PCR to give cleaner PCR results. PCR then can be conducted in solution after removal of the biotinylated cDNA extension product(s) from the solid support, or can be conducted on the solid support. If the 5′-end of the TSO(s) is attached to a solid support, the reactants from the initial extension prior to PCR can be washed away from the solid support, and PCR can then be conducted on the solid support.

As a further example of a cDNA library preparation method, RNA fragments are generated from total RNA or a certain type of RNA (e.g., mRNA). In some embodiments, the method further comprises: (i) hybridizing a target-selective oligonucleotide (TSO) to at least one member of the plurality of RNA fragments, wherein the TSO comprises a sequence complementary to at least a portion of a target RNA sequence of interest and a second adaptor sequence at the 5′-end of the TSO, wherein the second adaptor sequence optionally contains a strand-identifying barcode; and (ii) performing one cycle of extension of the hybridized TSO, using a reverse transcriptase for reverse transcription of RNA sequences, to produce a cDNA extension product comprising the second adaptor sequence and a sequence complementary to at least a portion of the target RNA sequence. The TSO can contain a moiety (e.g., biotin) capable of binding to an immobilized capturing reagent, or the 5′-end of the TSO can be attached to a solid support. In certain embodiments, the TSO comprises a sequence having at least about 50%, 60%, 70%, 80%, 90% or 95% identity or complementarity to a region of a gene, e.g., a cancer-related gene or mRNA. A plurality of TSOs targeting the same RNA sequence of interest, or a plurality of TSOs targeting a plurality of different RNA sequences of interest, can be used. After removal of the complementary RNA fragment(s) (e.g., by heat denaturing, alkaline hydrolysis, or enzymatic digestion of RNA (e.g., using RNase H)), the cDNA extension product (or a plurality of cDNA extension products if a plurality of TSOs targeting the same or different RNA sequence(s) of interest are used) can be isolated, e.g., by capturing biotinylated cDNA extension product(s) onto a streptavidin/avidin solid support and washing away the reactants from the extension reaction, or by washing away the reactants from the extension reaction if the 5′-end of the TSO(s) is attached to a solid support. In some embodiments, the method further comprises ligating a first adaptor sequence (e.g., an NGS adaptor that is different from the second adaptor sequence and optionally contains a sample-identifying barcode) to the 3′-end of at least, or about, 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 99%, or 100% of the single-stranded cDNA extension product(s). In some embodiments, the method further comprises performing PCR (optionally at a lower level, such as about 10 to about 15 cycles) on the 3′-adaptor-ligated cDNA extension product(s) using primers complementary to at least a portion of the first adaptor and second adaptor sequences of those cDNA extension product(s) as forward and reverse primers for PCR. PCR can be conducted in the presence of PEG (e.g., about 2 to about 5% or about 5 to about 10% PEG). The elongation step of PCR can be performed at a lower temperature (e.g., about 60 to about 65° C.), and PCR can be conducted in the presence of PEG (e.g., about 5 to about 10% PEG) to improve the efficiency of PCR at the lower elongation temperature. PCR can be performed in solution, or on a solid surface (e.g., biotinylated cDNA extension product(s) captured on a streptavidin/avidin solid support, or direct attachment of the 5′-end of the cDNA extension product(s) to a solid support).

In some embodiments, a 5′-phosphorylated first adaptor sequence (e.g., an NGS adaptor that optionally contains a sample-identifying barcode) is ligated to the 3′-end of at least about 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 99% or 100% of the plurality of RNA fragments (the fragments are optionally adenylated prior to ligation), e.g., similar to adaptor ligation to DNA fragments. The first adaptor at the 3′-end of the RNA fragments optionally can contain a moiety (e.g., biotin) capable of binding to an immobilized capturing reagent (e.g., a streptavidin/avidin solid support (e.g., beads, resin or column)), or can be attached to a solid support (e.g., beads, e.g., magnetic beads, or a flow cell). The 5′-end of the RNA fragments can optionally be capped, similar to end-capping of DNA fragments. Any RNA fragments not ligated at the 3′-end to an adaptor can optionally be removed by capturing, e.g., biotinylated fragments with a streptavidin/avidin solid support and washing away unligated fragments, or by washing away unligated fragments if the first adaptor at the 3′-end of the RNA fragments is directly attached to a solid support. In some embodiments, the method further comprises hybridizing a first primer to the first adaptor sequences at the 3′-end of at least about 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 30%, 50%, 70%, 90%, 95%, 99%, or 100% of the plurality of modified RNA fragments and extending the hybridized first primer. Linear amplification can be performed for a number of cycles (e.g., about 1, 5, 10, 100, or 10,000) to yield fragments comprising a region complementary to the target DNA sequence of interest and the first adaptor sequence. Linear amplification can be performed by a DNA polymerase. In particular embodiments, the DNA polymerase is a thermostable polymerase. The thermostable polymerase can originate from a thermophilic bacterium or from Archaea. Exemplary thermostable polymerases include, but are not limited to, Thermus aquaticus (Taq polymerase), Pyrococcus furiosus (Pfu polymerase), Vent® DNA Polymerase gene from Thermococcus litoralis, Deep Vent™ polymerase from Pyrococcus sp., Platinum® Pfx polymerase, Tfi polymerase from Thermus filiformis, Pwo polymerase, chimeric DNA polymerases comprising a DNA binding protein (e.g., Phusion, iProof), topoisomerase. In some embodiments, the polymerase is capable of isothermal amplification. The polymerase can be, e.g., Bst DNA polymerase, Bca DNA polymerase, E. coli DNA polymerase I, the Klenow fragment of E. coli DNA polymerase I, Taq DNA polymerase, or T7 DNA polymerase (Sequenase). The linearly amplified strand can be purified. In some embodiments, the method comprises ligating a second adaptor comprising a sequence, e.g., a sequence complementary at least partially to an NGS adaptor sequence, e.g., a second adaptor as further described below (e.g., an NGS adaptor that optionally contains a sample-identifying barcode) to the 3′-end of at least, or about, 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 30%, 50%, 70%, 90%, 95%, 99%, or 100% of the plurality of linear amplification products to generate a plurality of modified linear amplification products. The second adaptor may be of a length of about 15 nts to about 80 nts, about 18 to about 25 nts, or about 19 nts. The second adaptor can optionally contain a moiety (e.g., biotin) capable of binding to an immobilized capturing reagent (e.g., a streptavidin/avidin solid support (e.g., beads, resin or column)), or can be attached to a solid support (e.g., beads, e.g., magnetic beads, or a flow cell). The linear amplification product comprising an adaptor sequence on each end can be purified. The linear amplification product comprising an adaptor sequence on each end can be sequenced.

Array Hybridization Applications

The high efficiency ligation methods and kits described herein may also be used for the preparation of nucleic acid samples for array hybridization (e.g., nucleic acid microarray). Nucleic acid microarray techniques generally refer to techniques that rely on hybridization of nucleic acids to an array of oligonucleotide probes immobilized onto a solid or semi-solid surface. Nucleic acids (e.g., DNA) isolated from a sample are generally prepared by labeling with a detectable label. The labeled nucleic acids can then be applied to an array containing a plurality of oligonucleotides of known sequence (e.g., probes) immobilized onto addressable locations of a solid surface. The oligonucleotide probes may be hybridizable to a plurality of target regions of interest. In some embodiments, the oligonucleotide probes may be hybridizable to one or more adaptor sequences. The amount of detectable signal at a certain addressable location can indicate the amount of nucleic acids containing the target region in the sample. Exemplary microarray systems include, e.g., bead array systems (Illumina, Inc., Lynx Therapeutics, Luminex, Inc., Exiqon, Mycroarray), SNP arrays (available from, e.g., Agilent Technologies, Illumina, Inc., Affymetrix, Inc., Life Technologies, Inc., Nimblegen, Exiqon, Mycroarray), and comparative genome hybridization arrays (available from, e.g., Agilent Technologies, Illumina, Inc., Affymetrix, Inc., Life Technologies, Inc., Exiqon, Mycroarray). Bead array systems (available from, e.g., Illumina, Lynx Therapeutics, Luminex, Inc.) generally refer to array systems comprising microsphere beads impregnated with multiple copies of oligonucleotide probes. Beads may be addressable either by deposition into microwells or by barcoding with unique combinations of fluorophores, which may be sorted and identified by any means known in the art, including, e.g., flow cytometry. Exemplary bead array systems and methods are described in U.S. Pat. Nos. 8,399,192 and 8,198,028, which are hereby incorporated by reference. SNP arrays generally refer to arrays and systems that are configured to detect SNP alleles. Exemplary SNP arrays are described in, e.g., U.S. Pat. Nos. 6,410,231 and 6,858,394; US Patent Application Pub. No. 20090062138; and EP Patent Application No. EP1207209, all of which are hereby incorporated by reference. Comparative genome hybridization (CGH) generally refers to arrays and systems that enable high-resolution, genome-wide screening of segmental genomic copy number variations (CNVs). CGH platforms can detect aneuploidies, microdeletion/microduplication syndromes, and chromosomal rearrangements. Exemplary CGH arrays and array methods are described in, e.g., U.S. Pat. No. 6,410,243, hereby incorporated by reference.

Library preparation of nucleic acid samples (e.g., gDNA samples) for array hybridization generally involves labeling individual nucleic acid fragments with a detectable label. The labeling method traditionally involves hybridization of random primers to the nucleic acid fragments, followed by extension of the random primers by a polymerase. The extension reaction incorporates labeled nucleotides into the extension product. This method of labeling by extension by a polymerase can introduce labeling bias into the resulting library.

The high-efficiency ligation methods described herein can overcome the limitations of traditional library preparation methods for array hybridization by obviating the need for random primer hybridization and extension. Accordingly, in some aspects the disclosure provides methods and kits for preparing a nucleic acid library for array hybridization. In some embodiments, the method comprises ligating a labeled oligonucleotide to at least 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, or 90% of nucleic acids present in a sample, utilizing any of the methods as described herein (see, e.g., FIG. 10). The labeled oligonucleotide may comprise a detectable label or capture moiety. Exemplary detectable labels and capture moieties are described herein.

Barcoding Applications

Molecular barcoding is useful for the tracking, identification, and/or retrieval of individual nucleic acid molecules, subclasses of nucleic acid molecules, or samples of nucleic acid. Molecular barcoding generally involves tagging nucleic acid molecules with oligonucleotide sequences. The oligonucleotide sequences can be unique from sample to sample, from subclass to subclass, or from individual nucleic acid to individual nucleic acid, as desired by a user. Exemplary barcodes are described herein.

In one aspect, the high efficiency ligation method can be used to barcode a plurality of nucleic acid molecules. In some embodiments, the method comprises ligating a barcode sequence to a nucleic acid molecule using any of the methods as described herein. The methods described herein can ensure that over 80%, over 85%, over 90%, over 95%, over 97%, over 98%, over 99%, over 99.5%, over 99.9%, or substantially all of the nucleic acids in a sample to be barcoded are ligated to a barcode sequence. In some embodiments, each of a plurality of nucleic acid samples is barcoded by ligation to a single barcode sequence unique to the sample. Such barcoding allows sample origin to be identified in an assay. In other embodiments, a plurality of nucleic acids are barcoded such that each individual nucleic acid in a sample is ligated to a unique barcode sequence. Such barcoding allows for the tracking and identification of individual nucleic acids in a sample. In either method, nucleic acids in a sample can be adenylated in a reaction mixture as described herein, followed by ligation as described herein to a barcode sequence.
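
By way of illustration only, the sample-level barcoding described above can be sketched computationally as follows. The sketch assumes that each barcode is read as a fixed-length prefix of a sequencing read and that all barcodes share one length; the function names `tag` and `demultiplex` and all sequences are hypothetical and are not part of the disclosure.

```python
def tag(fragments, barcode):
    """Model ligation of one sample-specific barcode to every fragment."""
    return [barcode + fragment for fragment in fragments]


def demultiplex(reads, barcode_to_sample):
    """Assign each read to its sample of origin by its barcode prefix.

    Assumes every barcode in barcode_to_sample has the same length.
    Reads whose prefix matches no known barcode are discarded.
    """
    barcode_length = len(next(iter(barcode_to_sample)))
    by_sample = {sample: [] for sample in barcode_to_sample.values()}
    for read in reads:
        sample = barcode_to_sample.get(read[:barcode_length])
        if sample is not None:
            # Strip the barcode and keep only the fragment sequence.
            by_sample[sample].append(read[barcode_length:])
    return by_sample


if __name__ == "__main__":
    barcodes = {"ACGT": "sample_1", "TGCA": "sample_2"}
    pooled = tag(["GGG", "CCC"], "ACGT") + tag(["AAA"], "TGCA")
    print(demultiplex(pooled, barcodes))
```

Pooling the tagged fragments and then calling `demultiplex` recovers the sample of origin of each fragment, which is the property that sample-level barcoding relies upon.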

Cloning Applications

Molecular cloning often involves ligation of an insert DNA sequence into a vector, e.g., a plasmid vector. Generally, insert DNA and vector are prepared by restriction digest, wherein restriction enzymes can recognize a palindromic sequence within the insert DNA or vector and digest it, producing compatible sticky ends. The digested insert and vector are then incubated together in a ligation reaction, with the goal of annealing the compatible sticky ends of the vector to insert, producing a desired product comprising the vector and insert. However, due to the palindromic sticky ends, spurious ligation products are also created during the ligation process, including, e.g., insert/insert ligations and vector/vector ligations. This reduces the efficiency and specificity of the ligation reaction. As a result, a user must often expend significant amounts of time and effort to select a large number of transformed bacterial colonies and then to screen them, for example, by restriction fragment length polymorphism (RFLP), to select for the desired ligation product.

The high-efficiency ligation methods described herein can be used to improve the specificity of cloning reactions. An exemplary embodiment is depicted in FIG. 15. A vector can be linearized by any means, such as by restriction digest at a single site. The ends of the linearized vector can be blunt-ended, for example, by a DNA polymerase (e.g., T4 DNA polymerase). The 5′ terminus of a linearized vector can be phosphorylated, e.g., by T4 polynucleotide kinase. The linearized vector can be fully or partially denatured, producing at least single-stranded (e.g., frayed) ends or single-stranded linear DNA. High-efficiency ligation using any of the methods as described herein can be performed to ligate a non-palindromic short ssDNA sequence (“ssDNA”) onto the 3′ ends of the fully or partially denatured vector. An insert DNA fragment can also be blunt-ended and 5′ phosphorylated as described above. The insert DNA fragment can be fully or partially denatured. High-efficiency ligation using any of the methods as described herein is performed to ligate a non-palindromic short ssDNA sequence (“ssDNArev”) onto the 3′ ends of the fully or partially denatured insert. The modified vector and insert can then be ligated using standard ligation protocols. Because ssDNA and ssDNArev are non-palindromic sequences, formation of spurious vector/vector or insert/insert products does not occur, and any ligation will be between a single vector and a single insert. Alternatively, non-palindromic short ssDNA sequences can be ligated onto 5′ ends of the vector or insert. Such specificity can obviate the need for screening colonies by RFLP techniques, and greatly enhance workflow for molecular cloning.
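
By way of illustration only, the non-palindromic property relied upon above can be checked computationally. The sketch below treats a sequence as palindromic when it equals its own reverse complement, and it assumes (the disclosure does not state this explicitly) that ssDNArev is the reverse complement of ssDNA, so that a vector tail anneals to an insert tail but not to another vector tail. The function names and test sequences are hypothetical.

```python
# Translation table mapping each base to its Watson-Crick complement.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def is_palindromic(seq):
    """True if the sequence equals its own reverse complement."""
    return seq.upper() == reverse_complement(seq)


def tails_are_specific(vector_tail, insert_tail):
    """True if the two tails can pair with each other but not with themselves.

    Assumption: the insert tail is intended to be the reverse complement
    of the vector tail, and neither tail is self-complementary.
    """
    return (
        not is_palindromic(vector_tail)
        and not is_palindromic(insert_tail)
        and reverse_complement(vector_tail) == insert_tail.upper()
    )


if __name__ == "__main__":
    print(is_palindromic("GAATTC"))                 # restriction-site-like tail
    print(tails_are_specific("GATTACA", "TGTAATC"))
```

A self-complementary tail such as a typical restriction-site overhang fails `tails_are_specific`, which corresponds to the vector/vector and insert/insert products that the non-palindromic design is intended to avoid.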

Diagnostic/Therapeutic Applications

The high efficiency ligation methods and kits as described herein have general utility in a number of diagnostic/therapeutic applications. For instance, the high efficiency ligation methods of the disclosure are of general utility for sequence analysis of nucleic acids, which is playing an increasingly important role in the diagnosis, monitoring, and treatment of diseases. For example, the disclosure methods may be utilized in, e.g., the identification of subjects that have increased likelihood of developing a disease, for diagnosing a disease, for improving accuracy of disease diagnosis, for monitoring the progression of a disease, for aiding selection of a therapeutic regimen for a disease in a subject, or for evaluating disease prognosis in a subject.

It is understood that there is no limit to the diagnostic/therapeutic applications or disease types that may benefit from the disclosure methods. By way of example only, the application of the disclosure methods to a workflow for monitoring cancer is described herein.

Accordingly, the disclosure provides methods and kits that improve the monitoring and treatment of a subject suffering from a disease. The disease can be a cancer, e.g., a tumor, a leukemia such as acute leukemia, acute T-cell leukemia, acute lymphocytic leukemia, acute myelocytic leukemia, myeloblastic leukemia, promyelocytic leukemia, myelomonocytic leukemia, monocytic leukemia, erythroleukemia, chronic leukemia, chronic myelocytic (granulocytic) leukemia, or chronic lymphocytic leukemia, polycythemia vera, lymphomas such as Hodgkin's lymphoma, follicular lymphoma or non-Hodgkin's lymphoma, multiple myeloma, Waldenstrom's macroglobulinemia, heavy chain disease, solid tumors, sarcomas, carcinomas such as, e.g., fibrosarcoma, myxosarcoma, liposarcoma, chondrosarcoma, osteogenic sarcoma, lymphangiosarcoma, mesothelioma, Ewing's tumor, leiomyosarcoma, rhabdomyosarcoma, colon carcinoma, colorectal cancer, pancreatic cancer, breast cancer, ovarian cancer, prostate cancer, squamous cell carcinoma, basal cell carcinoma, adenocarcinoma, sweat gland carcinoma, sebaceous gland carcinoma, papillary carcinoma, papillary adenocarcinomas, cystadenocarcinoma, medullary carcinoma, bronchogenic carcinoma, renal cell carcinoma, hepatoma, bile duct carcinoma, choriocarcinoma, seminoma, embryonal carcinoma, Wilms' tumor, cervical cancer, uterine cancer, testicular tumor, lung carcinoma, small cell lung carcinoma, bladder carcinoma, epithelial carcinoma, glioma, craniopharyngioma, ependymoma, pinealoma, hemangioblastoma, acoustic neuroma, oligodendroglioma, meningioma, melanoma, neuroblastoma, retinoblastoma, endometrial cancer, or non-small cell lung cancer.

The subject can be suspected or known to harbor a solid tumor, or can be a subject who previously harbored a solid tumor.

The method can comprise sequencing a set of cancer-related genes from a tumor sample isolated from the subject and, optionally, sequencing a set of cancer-related genes from normal cells isolated from the subject. The tumor sample can be a solid tumor sample. The normal cells can be, e.g., blood cells isolated from a blood sample from the subject.

Generally, a library of nucleic acids isolated from the subject is sequenced. Standard sequencing protocols often comprise pre-amplification of the nucleic acid library to achieve a desired read depth. However, pre-amplification can introduce amplification bias due to variable amplification efficiency of individual nucleic acid library members, which can result in over-representation of some genomic regions and under-representation of other genomic regions (e.g., regions with high or low GC content). Pre-amplification can also introduce sequencing errors due to intrinsic error rates of polymerases used for PCR. Accordingly, the disclosure provides, in some aspects, methods of sequencing a library of nucleic acids isolated from a biological source without pre-amplification of the library. In some embodiments, the library is not pre-amplified prior to loading onto a sequencer.

Upon sequencing, sequence data from the tumor can be compared to sequence data from normal cells to generate a tumor-specific sequence profile. In some embodiments, the tumor-specific sequence profile comprises mutational status of one or more genes in the set. The mutational status may include SNP or CNV identification. The method can further comprise generating a report describing the tumor-specific sequence profile. In some embodiments, the method further comprises choosing a subset of 2-4 genes known to harbor tumor-specific mutations for further monitoring. In other embodiments, the method comprises choosing a subset of 4-15, 10-30, 20-50, 40-80, 70-125, 100-200, or more than 200 genes known to harbor tumor-specific mutations for further monitoring. In some embodiments, the method comprises selecting the entirety of the set of cancer-related genes for further monitoring. In other embodiments, the method comprises use of whole genome sequencing for the purposes of further monitoring. In some embodiments, a sample from a solid tumor and a fluid sample (e.g., plasma) are used to generate two mutational profiles from a subject pre-treatment. The mutational profiles of the two samples can be compared, and a subset of genes and/or variants to monitor further can be selected based upon the comparison. In some cases, a subset of genes and/or variants is chosen because they are shared between the two samples.
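
By way of illustration only, the comparisons described above can be sketched as set operations on variant calls. The sketch assumes that each variant is represented as a (gene, change) pair and that variant calling has already been performed; the function names, gene labels, and variant labels are hypothetical placeholders and do not represent data of the disclosure.

```python
def tumor_specific_profile(tumor_variants, normal_variants):
    """Variants seen in the tumor sample but not in the subject's normal cells."""
    return set(tumor_variants) - set(normal_variants)


def shared_variants(profile_a, profile_b):
    """Variants present in both profiles, e.g., solid tumor and plasma."""
    return set(profile_a) & set(profile_b)


def genes_to_monitor(variants):
    """Sorted list of the genes carrying the selected variants."""
    return sorted({gene for gene, _change in variants})


if __name__ == "__main__":
    # Placeholder variant calls, for illustration only.
    tumor = {("GENE_A", "c.35G>A"), ("GENE_B", "c.100C>T"), ("GENE_C", "c.7A>G")}
    normal = {("GENE_C", "c.7A>G")}
    plasma = {("GENE_A", "c.35G>A"), ("GENE_D", "c.1A>T")}

    profile = tumor_specific_profile(tumor, normal)
    print(genes_to_monitor(shared_variants(profile, plasma)))
```

In this sketch, a variant also found in normal cells is excluded from the tumor-specific profile, and only variants shared between the tumor profile and the plasma profile are carried forward for monitoring.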

Sensitive Detection of Amplicons

The present disclosure provides reagents, methods, and kits for the sensitive, accurate detection and/or quantification of a mutation in a target polynucleotide. For example, the present disclosure provides reagents, methods, and kits for probe-based PCR assays that substantially obviate the influence of a probe on efficiency of a PCR reaction. The present disclosure provides reagents, methods, and kits for probe-based PCR assays that substantially obviate the influence of a probe on kinetics of a PCR reaction. Such reagents, methods, and kits can improve the accuracy and sensitivity of detection as compared to conventional probe-based assays, and thus can have wide applicability in the life sciences, in genotyping approaches, and in diagnostic/therapeutic approaches.

Aspects of the disclosure relate to probe-based PCR assays in which a probe does not impact primer annealing or primer extension during PCR. Without wishing to be bound by theory, hybridization of a probe to a template nucleic acid during PCR can alter the kinetics of primer extension, and therefore can alter efficiency of the PCR reaction. Furthermore, binding of a probe to a template nucleic acid downstream of an annealed primer can impact extension of the primer by a polymerase, as sufficient endonuclease activity may be required to displace the annealed probe. Accordingly, described herein are probes designed to obviate probe hybridization during a PCR annealing and/or extension phase. Such probes can increase the efficiency of PCR amplification. Such probes can minimize extension bias related to probe binding during a PCR annealing and/or extension phase.

A probe for sensitive detection of amplicons as described herein can provide highly accurate and sensitive detection of a mutation. The mutation can be a single nucleotide polymorphism (SNP), insertion, deletion, translocation, and/or copy number variation. Probes of the disclosure can detect a rare mutation in a heterogeneous sample. A probe for sensitive detection of amplicons can detect a rare mutation present in a sample at a frequency of less than 50%, 40%, 30%, 20%, 10%, 9%, 8%, 7%, 6%, 5%, 4%, 3%, 2%, 1%, 0.9%, 0.8%, 0.7%, 0.6%, 0.5%, 0.4%, 0.3%, 0.2%, 0.1%, 0.05%, 0.01%, 0.005%, 0.001%, 0.0005%, 0.0001%, 0.00005%, 0.00001%, 0.000005%, 0.000001%, 0.0000005%, or 0.0000001% of the sample. For example, the rare mutation detected by the probe at any of the foregoing frequencies can be a rare SNP, a rare insertion mutation, a rare deletion mutation, or a rare inversion mutation. As a further example, a probe for sensitive detection of amplicons can detect a rare copy number variation of a gene in a sample, the rare copy number variation comprising a fold change in copy number of as low as 1.01-fold.

Also provided herein are methods for the detection of a rare mutation present in a sample at a frequency of less than 50%, 40%, 30%, 20%, 10%, 9%, 8%, 7%, 6%, 5%, 4%, 3%, 2%, 1%, 0.9%, 0.8%, 0.7%, 0.6%, 0.5%, 0.4%, 0.3%, 0.2%, 0.1%, 0.05%, 0.01%, 0.005%, 0.001%, 0.0005%, 0.0001%, 0.00005%, 0.00001%, 0.000005%, 0.000001%, 0.0000005%, or 0.0000001% of the sample. For example, the rare mutation detected by a method of the disclosure at any of the foregoing frequencies can be a rare SNP, a rare insertion mutation, a rare deletion mutation, or a rare inversion mutation. As a further example, a method of the disclosure can detect a rare copy number variation of a gene in a sample, the rare copy number variation comprising a fold change in copy number of as low as 1.01-fold.

Probes for Sensitive Detection of Amplicons

The disclosure provides probes for probe-based hybridization assays. The probe-based hybridization assay can be a probe-based PCR assay, although any probe-based hybridization assay is contemplated. In some embodiments, probes are designed to have minimal to zero impact on kinetics and/or efficiency of a PCR amplification reaction. The impact of a probe on kinetics and/or efficiency of a PCR amplification reaction can relate to an ability of the probe to hybridize or not hybridize to a target polynucleotide during an annealing and/or extension phase of a PCR reaction. The impact of a probe on kinetics and/or efficiency of a PCR amplification reaction can relate to an ability of the probe to hybridize or not hybridize to a target polynucleotide during PCR thermal cycling. For example, a probe for sensitive detection of amplicons can have minimal or zero impact on kinetics and/or efficiency of a PCR amplification reaction by not appreciably hybridizing to a template nucleic acid during an annealing and/or extension phase of the PCR amplification reaction.

The ability of a probe to hybridize or not to a target polynucleotide during an annealing and/or extension phase of a PCR reaction can relate to a melting temperature (Tm) of the probe. A probe for sensitive detection of amplicons can have a melting temperature (Tm) that is not higher than the Tm of PCR primers used in a probe-based PCR assay. A probe for sensitive detection of amplicons can have a melting temperature (Tm) that is not at least 5-10° C. higher than the average Tm of PCR primers for use in a probe-based PCR assay.

Generally, a probe with a Tm that is lower than a PCR annealing temperature would be expected to exhibit reduced probe hybridization during a PCR annealing phase. A probe for sensitive detection of amplicons can have a melting temperature (Tm) that is not higher than a temperature of a PCR annealing phase. A probe for sensitive detection of amplicons can have a melting temperature (Tm) that is lower than a temperature of a PCR annealing phase. A probe with a Tm that is at least 5° C. lower than a PCR annealing temperature can be expected to exhibit significantly reduced hybridization during a PCR annealing phase. Accordingly, the Tm of a probe for sensitive detection of amplicons can be at least 5° C. less, at least 10° C. less, at least 15° C. less, at least 20° C. less, or more than 20° C. less than a temperature of a PCR annealing phase. A probe for sensitive detection of amplicons can be a low Tm probe. The Tm of a low Tm probe can be at least 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, or more than 40° C. less than an annealing temperature of a PCR thermal cycling round. The Tm of a low Tm probe can be about 5-10° C. less, about 10-15° C. less, about 15-20° C. less, about 20-25° C. less, or about 25-30° C. less than an annealing temperature of a PCR thermal cycling round. In some cases, a low Tm probe does not hybridize to a complementary template nucleic acid at an ambient temperature above 55° C., above 60° C., above 65° C., or above 70° C.

A low Tm probe can have a Tm that is below 55° C., below 54° C., below 53° C., below 52° C., below 51° C., below 50° C., below 49° C., below 48° C., below 47° C., below 46° C., below 44° C., below 43° C., below 42° C., below 41° C., below 40° C., below 39° C., below 38° C., below 37° C., below 36° C., below 35° C., below 34° C., below 33° C., below 32° C., below 31° C., or below 30° C.

A low Tm probe can be designed to hybridize readily to a template nucleic acid at about room temperature. Such a probe design can ensure sufficient hybridization of the probe to its target polynucleotide so as to enable adequate detection of the probe. Generally, a probe can hybridize readily to a template nucleic acid at about room temperature if the Tm of the probe/template duplex is higher than room temperature. Accordingly, a low Tm probe can be designed to have a Tm that is 5° C. higher, 10° C. higher, 15° C. higher, 20° C. higher, or more than 20° C. higher than room temperature (e.g., a room temperature of 25° C.). Such a Tm can ensure at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or about 100% of probe hybridization to template nucleic acid at room temperature. In some embodiments, a low Tm probe has a Tm that is above 25° C., above 26° C., above 27° C., above 28° C., above 29° C., above 30° C., above 31° C., above 32° C., above 33° C., above 34° C., above 35° C., above 36° C., above 37° C., above 38° C., above 39° C., above 40° C., above 41° C., above 42° C., above 43° C., above 44° C., or above 45° C.

In some embodiments, a low Tm probe has a Tm that is about 30° C., about 31° C., about 32° C., about 33° C., about 34° C., about 35° C., about 36° C., about 37° C., about 38° C., about 39° C., about 40° C., about 41° C., about 42° C., about 43° C., about 44° C., about 45° C., about 46° C., about 47° C., about 48° C., about 49° C., or about 50° C. The low Tm probe can have a Tm that is 30-35° C., 33-40° C., 36-45° C., or 40-50° C. The low Tm probe can have a Tm that is between 30° C. and 45° C.
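
By way of illustration only, the Tm window described above (a probe Tm at least 5° C. below the PCR annealing temperature and at least 5° C. above room temperature) can be checked with a short calculation. The sketch below estimates Tm with the Wallace rule, Tm = 2(A+T) + 4(G+C), which is a rough approximation for short unmodified oligonucleotides and is an assumption of this sketch rather than a method of the disclosure; the function names, the 5° C. margin default, and the example sequences are likewise hypothetical.

```python
def wallace_tm(seq):
    """Rough Tm estimate in deg C by the Wallace rule: 2*(A+T) + 4*(G+C)."""
    s = seq.upper()
    at = s.count("A") + s.count("T")
    gc = s.count("G") + s.count("C")
    if at + gc != len(s):
        raise ValueError("sequence must contain only A, C, G, and T")
    return 2 * at + 4 * gc


def is_low_tm_probe(seq, annealing_temp, room_temp=25.0, margin=5.0):
    """True if the estimated Tm lies in the low Tm window.

    The window is [room_temp + margin, annealing_temp - margin], so the
    probe is expected to hybridize at room temperature but not during
    the annealing phase of PCR.
    """
    tm = wallace_tm(seq)
    return room_temp + margin <= tm <= annealing_temp - margin


if __name__ == "__main__":
    for probe in ("ACGTACGTACGT", "ATATATATATAT", "GCGCGCGCGCGCGCGCGCGC"):
        print(probe, wallace_tm(probe), is_low_tm_probe(probe, annealing_temp=60.0))
```

In practice, Tm depends on salt concentration, probe concentration, and any probe modifications, so a nearest-neighbor calculation or empirical measurement would be expected to replace the Wallace estimate when designing an actual probe.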

The probe for sensitive detection of amplicons can comprise a detectable moiety and a quencher moiety. A detectable moiety can be a chemiluminescent, radioactive, metal ion, chemical ligand, fluorescent, or colorimetric moiety, or can be an enzymatic group which, upon incubation with an appropriate substrate, provides a chemiluminescent, fluorescent, radioactive, electrical, or colorimetric signal. In some cases, the detectable moiety is a dye. The dye can be a fluorescent dye, e.g., a fluorophore. The fluorescent dye can be a derivatized dye for attachment to the terminal 3′ carbon or terminal 5′ carbon of the probe via a linking moiety. In some embodiments, the dye is derivatized for attachment to a terminal 5′ carbon of the probe via a linking moiety. The quencher can be a fluorescent dye. Alternatively, the quencher may be a non-fluorescent moiety. Quenching can involve a transfer of energy between the fluorophore and the quencher. The emission spectrum of the fluorophore and the absorption spectrum of the quencher can overlap.

The probe for sensitive detection of amplicons can be designed according to Livak et al., “Oligonucleotides with fluorescent dyes at opposite ends provide a quenched probe system useful for detecting PCR product and nucleic acid hybridization,” PCR Methods Appl. 1995 4: 357-362, which is hereby incorporated by reference.

Reporter-quencher moiety pairs for particular probes can be selected according to, e.g., Pesce et al., editors, Fluorescence Spectroscopy (Marcel Dekker, New York, 1971); White et al., Fluorescence Analysis: A Practical Approach (Marcel Dekker, New York, 1970). Exemplary fluorescent and chromogenic molecules that may be used in reporter-quencher pairs are described in, e.g., Berlman, Handbook of Fluorescence Spectra of Aromatic Molecules, 2nd Edition (Academic Press, New York, 1971); Griffiths, Colour and Constitution of Organic Molecules (Academic Press, New York, 1976); Bishop, editor, Indicators (Pergamon Press, Oxford, 1972); Haugland, Handbook of Fluorescent Probes and Research Chemicals (Molecular Probes, Eugene, 1992); and Pringsheim, Fluorescence and Phosphorescence (Interscience Publishers, New York, 1949), which are hereby incorporated by reference.

A wide variety of reactive fluorescent reporter dyes can be used so long as they are quenched by a quencher dye of the disclosure. The fluorophore can be an aromatic or heteroaromatic compound. The fluorophore can be, for example, a pyrene, anthracene, naphthalene, acridine, stilbene, benzoxazole, indole, benzindole, oxazole, thiazole, benzothiazole, cyanine, carbocyanine, salicylate, anthranilate, xanthene dye, or coumarin. Exemplary xanthene dyes include, e.g., fluorescein and rhodamine dyes. Exemplary fluorescein and rhodamine dyes include, but are not limited to, 6-carboxyfluorescein (FAM), 2′7′-dimethoxy-4′5′-dichloro-6-carboxyfluorescein (JOE), tetrachlorofluorescein (TET), 6-carboxyrhodamine (R6G), N,N,N′,N′-tetramethyl-6-carboxyrhodamine (TAMRA), and 6-carboxy-X-rhodamine (ROX). Suitable fluorescent reporters also include the naphthylamine dyes that have an amino group in the alpha or beta position. For example, naphthylamino compounds include 1-dimethylaminonaphthyl-5-sulfonate, 1-anilino-8-naphthalene sulfonate and 2-p-toluidinyl-6-naphthalene sulfonate, 5-(2′-aminoethyl) aminonaphthalene-1-sulfonic acid (EDANS). Exemplary coumarins include, e.g., 3-phenyl-7-isocyanatocoumarin; acridines, such as 9-isothiocyanatoacridine and acridine orange; N-(p-(2-benzoxazolyl)phenyl) maleimide; cyanines, such as, e.g., indodicarbocyanine 3 (Cy3), indodicarbocyanine 5 (Cy5), indodicarbocyanine 5.5 (Cy5.5), 3-(carboxy-pentyl)-3′-ethyl-5,5′-dimethyloxacarbocyanine (CyA); 1H, 5H, 11H, 15H-Xantheno[2,3,4-ij:5,6,7-i′j′]diquinolizin-18-ium, 9-[2 (or 4)-[[[6-[2,5-dioxo-1-pyrrolidinyl)oxy]-6-oxohexyl]amino]sulfonyl]-4 (or 2)-sulfophenyl]-2,3,6,7,12,13,16,17-octahydro-inner salt (TR or Texas Red); or BODIPY™ dyes. Exemplary fluorescent and quencher moieties are described in, e.g., WO/2005/049849, which is hereby incorporated by reference.

As is known in the art, suitable quenchers are selected according to the fluorescent moiety. Exemplary reporters and quenchers are further described in Anderson et al, U.S. Pat. No. 7,601,821, hereby incorporated by reference.

Quenchers are also available from various commercial sources. Exemplary commercially available quenchers include, e.g., Black Hole Quenchers® from Biosearch Technologies and Iowa Black® or ZEN quenchers from Integrated DNA Technologies, Inc.

In some embodiments, the probe for sensitive detection of amplicons comprises two quencher moieties. Exemplary probes comprising two quencher moieties include the ZEN probes from Integrated DNA Technologies. Such probes comprise an internal quencher moiety that is located about 9 bases away from the detectable moiety, and generally reduce background signal associated with traditional reporter/quencher probes.

Detectable moieties and quencher moieties can be derivatized for covalent attachment to oligonucleotides via common reactive groups or linking moieties. Methods for derivatization of detectable and quencher moieties are described in, e.g., Ullman et al, U.S. Pat. No. 3,996,345; Khanna et al, U.S. Pat. No. 4,351,760; Eckstein, editor, Oligonucleotides and Analogues: A Practical Approach (IRL Press, Oxford, 1991); Zuckerman et al, Nucleic Acids Research, 15: 5305-5321 (1987) (3′ thiol group on oligonucleotide); Sharma et al, Nucleic Acids Research, 19:3019 (1991) (3′ sulfhydryl); Giusti et al, PCR Methods and Applications, 2:223-227 (1993) and Fung et al, U.S. Pat. No. 4,757,141 (5′ phosphoamino group via Aminolink™ II available from Applied Biosystems, Foster City, Calif.); Stabinsky, U.S. Pat. No. 4,739,044 (3′ aminoalkylphosphoryl group); Agrawal et al, Tetrahedron Letters, 31:1543-1546 (1990) (attachment via phosphoramidate linkages); Sproat et al, Nucleic Acids Research, 15:4837 (1987) (5′ mercapto group); Nelson et al, Nucleic Acids Research, 17:7187-7194 (1989) (3′ amino group); all of which are hereby incorporated by reference.

In some embodiments, commercially available linking moieties can be attached to an oligonucleotide during synthesis, e.g., linking moieties available through Clontech Laboratories (Palo Alto, Calif.).

By way of example only, rhodamine and fluorescein dyes can be derivatized with a phosphoramidite moiety for attachment to a 5′ hydroxyl of an oligonucleotide (see, e.g., Woo et al, U.S. Pat. No. 5,231,191; and Hobbs, Jr., U.S. Pat. No. 4,997,928), hereby incorporated by reference.

In some embodiments, the detectable moiety produces a non-fluorescent signal. For example, any probe for which hybridization of the probe to a template results in a detectable separation of the detectable moiety from the quenching moiety may be used. For example, release of the detectable moiety may be detected electronically, by quantum dot sensing, by luminescence, or chemically (e.g., by a change in pH in a solution resulting from probe hybridization). Likewise, any probe that binds to a probe-binding region and for which a change in signal can be detected upon separation of a detectable moiety from a quencher moiety may be used. For example, molecular beacon probes, MGB probes, Pleiades probes, Scorpion probes, or other probes are contemplated for use in the disclosure.

Molecular beacon probes are described in, e.g., U.S. Pat. Nos. 5,925,517 and 6,103,406, which are hereby incorporated by reference. Molecular beacon probes generally refer to hairpin or bimolecular oligonucleotide probes. A hairpin molecular beacon probe can comprise a detectable moiety at one end of the hairpin and a quencher moiety at the other end of the hairpin, wherein the hairpin comprises a template-binding region. Without wishing to be bound by theory, hybridization of the template-binding region to a template can open the hairpin structure of the probe and separate the detectable moiety from the quencher moiety, enabling detection of the detectable moiety. A bimolecular beacon probe can comprise two oligonucleotide strands having sequences that are complementary to each other at the 5′ end and 3′ end, respectively. The complementary sequences can be conjugated to a detectable moiety and a quencher moiety, respectively. Each of the two oligonucleotide strands can further comprise a template-binding sequence, with the two template-binding sequences binding to different regions of a target sequence. The formation of Watson-Crick bonding between the complementary strands can result in the formation of a Y structure and bring the detectable moiety in close proximity with the quencher moiety, resulting in quenching of the detectable moiety. Hybridization of the template-binding sequences to the target polynucleotide can break the duplex between the complementary sequences, thus separating the detectable moiety from the quencher moiety and resulting in dequenching of the detectable moiety.

MGB probes are described in, e.g., U.S. Pat. Nos. 7,582,739; 7,381,818; 6,492,346; 6,321,894; 6,303,312; and 6,221,589; which are hereby incorporated by reference. MGB probes refer to oligonucleotide probes comprising a minor groove binder (MGB). The term “minor groove binder”, as used herein, generally refers to a molecule capable of binding within the minor groove of double-stranded DNA, double-stranded RNA, DNA-RNA hybrids, DNA-PNA hybrids, hybrids in which one strand is a PNA/DNA chimera, and/or polymers containing purine and/or pyrimidine bases and/or their analogues which are capable of base-pairing to form duplex, triplex or higher order structures comprising a minor groove. The MGB domain of the probe can stabilize a duplex formed between the probe and its corresponding template polynucleotide. Incorporation of an MGB can enable the use of short probes, can enhance the stability of a probe/template duplex, and can retain the specificity of an allele-specific probe. An MGB probe can have an MGB ligand and a quencher located at the 3′-end of the probe and a fluorophore attached at the 5′-end of the probe. Alternatively, an MGB probe can have an MGB ligand and quencher located at the 5′-end of the probe and a fluorophore at the 3′-end of the probe.

Pleiades probes are described in US Patent Publication Nos. 20046727356, 20077205105 and 20090111100, hereby incorporated by reference. Pleiades probes generally refer to MGB probes that comprise a detectable moiety, e.g., a fluorophore, in close proximity to an MGB at a first end of the probe, and a quencher moiety at a second end of the probe. The detectable moiety can be quenched by the quencher moiety, and additionally can be further quenched by the MGB.

Probes for sensitive detection of amplicons can be designed to have a particular length. The length of a probe for sensitive detection of amplicons can be sufficiently long that the detectable moiety and quencher are in close enough proximity so as to quench the detectable moiety when the probe is free in solution (e.g., in an unhybridized state). By way of example only, a probe for sensitive detection of amplicons can, in its unhybridized state, exhibit less than 50%, less than 40%, less than 30%, less than 20%, less than 10%, less than 5%, less than 4%, less than 3%, less than 2%, less than 1%, less than 0.5%, less than 0.1%, less than 0.01%, less than 0.001%, or less than 0.0001% fluorescence as compared to the probe in a fully hybridized state. Without wishing to be bound by theory, hybridization of such probes can cause the probes to lose their random coiled state and fully stretch out, increasing the distance between a probe's detectable moiety and quencher moiety, thereby activating the detectable moiety. Such hybridization-dependent activatable probes are described in, e.g., U.S. Pat. No. 6,030,787, U.S. Pat. No. 5,723,591, U.S. Pat. No. 7,485,442, and U.S. application Ser. No. 10/165,410, which are hereby incorporated by reference. The detectable moiety and the quencher can be spaced at least 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, or more than 30 nucleotides apart. The detectable moiety and the quencher can be spaced about 7-10, 9-15, 12-20, 20-30, or more than 30 nucleotides apart. The overall length of the probe can be 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, or more than 30 nucleotides. The overall length of the probe can be about 7-12, 12-20, 20-30, or more than 30 nucleotides.

In some embodiments, the probe comprises a nucleotide with a Tm enhancing base. The probe can comprise a Superbase™, a locked nucleic acid, or bridge nucleic acid. Exemplary locked or bridge nucleic acids are described herein.

Probes can be designed to selectively hybridize to a target polynucleotide of interest. Probes can be designed to have at least 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 96, 97, 98, 99, or 100% complementarity to a target polynucleotide.

In some embodiments, a probe can be designed to have a length less than 15, 14, 13, 12, 11, or 10 nucleotides. In some embodiments, such a probe has a GC content that is more than 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, or up to 80%. In some embodiments, a probe having a length less than 15, 14, 13, 12, 11, or 10 nucleotides comprises a GC content greater than 40%, such as, e.g., 40-80%. In some cases, a probe having a length less than 15, 14, 13, 12, 11, or 10 nucleotides and a GC content that is more than 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, or up to 80% does not comprise a modified nucleotide such as a bridge or locked nucleotide. In other embodiments, a probe having a length less than 15, 14, 13, 12, 11, or 10 nucleotides comprises a GC content less than 40%, 35%, 30%, or 25%. In particular embodiments, such a probe further comprises a modified nucleotide. In some cases, the modified nucleotide is a locked or bridge nucleotide. In some cases, such a probe comprises a peptide nucleic acid. In such cases, a probe does not necessarily comprise a modified nucleotide.
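The length and GC-content ranges above lend themselves to a simple computational check. The following Python sketch is illustrative only: the function names `gc_content` and `suggest_modification` are hypothetical, and the 15-nucleotide and 40% thresholds are assumptions taken from the example values in this paragraph rather than a prescribed design rule.

```python
def gc_content(seq: str) -> float:
    """Return the GC content of a probe sequence as a percentage."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def suggest_modification(seq: str, max_len: int = 15, gc_cutoff: float = 40.0) -> bool:
    """Return True for a short, low-GC probe, i.e., one that might
    benefit from a Tm-enhancing modification such as a locked or
    bridge nucleotide.

    Hypothetical helper: both thresholds are illustrative assumptions.
    """
    return len(seq) < max_len and gc_content(seq) < gc_cutoff
```

Under these assumptions, a 10-nucleotide probe with 20% GC content would be flagged, while a 10-nucleotide probe with 80% GC content or any probe of 15 or more nucleotides would not.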

In other embodiments, a probe is designed to have a length of 15 or more, 16 or more, 17 or more, 18 or more, 19 or more, 20 or more, 21 or more, 22 or more, 23 or more, 24 or more, 25 or more, or 30 or more nucleotides. In particular embodiments, such probes have a GC content that is less than 80%. For example, such probes can have a GC content that is less than 80%, less than 75%, less than 70%, less than 65%, less than 60%, less than 55%, less than 50%, less than 45%, less than 40%, less than 35%, or less than 30%. In particular embodiments, a probe for sensitive detection of amplicons having a length of 15 or more, 16 or more, 17 or more, 18 or more, 19 or more, 20 or more, 21 or more, 22 or more, 23 or more, 24 or more, or 25 or more nucleotides also has a GC content that is about 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, or 70%.

A probe for sensitive detection of amplicons can be designed for highly sensitive allelic discrimination, e.g., can be an allele-specific probe. Such probes can be designed to partially or fully overlay a locus suspected of harboring a mutation such as, e.g., a SNP, insertion, deletion, or inversion. An allele-specific probe can be designed to be perfectly matched (e.g., perfectly complementary) to a template nucleic acid containing a specific allele at a locus, but to comprise a mismatch to any other allele of the locus. The mismatch can be a mismatch of 1, 2, 3, 4, 5, or more than 5 nucleotides. In some embodiments, an allele-specific probe can form a duplex with a perfectly matched template nucleic acid containing a specific allele at a locus. In some embodiments, the probe/perfectly matched template duplex has a first Tm. The allele-specific probe can also form a duplex with a mismatched template nucleic acid containing a different allele at the same locus. In some embodiments, the probe/mismatched template duplex has a second Tm. The difference between the first and second Tm (e.g., the binding penalty of the mismatch) can be at least 1% of the total binding energy of the probe to the template.

A probe can be designed for the sensitive and accurate detection of a target polynucleotide that is not suspected of harboring a mutation such as a SNP, insertion, deletion, or inversion. For example, the target polynucleotide may be suspected of having a copy number variation. In such cases, a probe is not necessarily designed to have a mismatch to the target polynucleotide. In some cases, the probe is designed to be perfectly matched to the target polynucleotide.

Probes can be designed to not hybridize to their target template nucleic acid during PCR. PCR generally involves repeated rounds of thermal cycling. Probes can be designed to not hybridize during the repeated rounds of thermal cycling. A user may set thermal cycling parameters to comprise repeated cycles, the repeated cycles comprising a denaturation step, an annealing step, and an extension step. In some embodiments the repeated cycles do not include any temperature step below 50° C. Following the repeated cycles, a user may also include a final extension step. In some embodiments, the final extension step is not below 50° C. In particular embodiments, the final extension step is about 65-75° C. Following the repeated cycles, a user may include a final extension step and/or a cooling step wherein the reaction temperature is reduced to below 45° C., below 40° C., below 35° C., below 30° C., or at or below 25° C. In some embodiments, the probe of the disclosure hybridizes to its target template nucleic acid during the cooling step. In such cases, a user may perform endpoint detection of target amplicons. In some embodiments, the cooling step may comprise a controlled cooling step wherein a reaction temperature cools at a constant rate. The constant rate may be 0.01° C./second, 0.02° C./second, 0.03° C./second, 0.04° C./second, 0.05° C./second, 0.06° C./second, 0.07° C./second, 0.08° C./second, 0.09° C./second, 0.10° C./second, 0.2° C./second, 0.3° C./second, 0.4° C./second, 0.5° C./second, 0.6° C./second, 0.7° C./second, 0.8° C./second, 0.9° C./second, or 1° C./second. In such cases, a user may note a temperature at which fluorescence is detected. In some cases, the temperature at which fluorescence is detected may provide information to a user as to a mutational status of a target nucleic acid.
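By way of illustration only, the timing of a controlled cooling step at a constant rate can be sketched as follows. The function names `cooling_duration_s` and `temperature_at` are hypothetical, and the example temperatures and rate are assumptions chosen from the ranges recited above; the sketch does not model any particular thermal cycler.

```python
def cooling_duration_s(start_c: float, end_c: float, rate_c_per_s: float) -> float:
    """Seconds needed to cool from start_c to end_c at a constant rate."""
    if rate_c_per_s <= 0:
        raise ValueError("cooling rate must be positive")
    if end_c > start_c:
        raise ValueError("end temperature must not exceed start temperature")
    return (start_c - end_c) / rate_c_per_s


def temperature_at(elapsed_s: float, start_c: float, end_c: float,
                   rate_c_per_s: float) -> float:
    """Reaction temperature after elapsed_s seconds of controlled cooling.

    The temperature falls linearly and is held at end_c once reached, so a
    fluorescence reading taken at a known time maps to a known temperature.
    """
    return max(end_c, start_c - rate_c_per_s * elapsed_s)
```

For instance, cooling from a 72° C. final extension to 25° C. at 0.1° C./second would take about 470 seconds under this model, and a signal first observed 100 seconds into the ramp would correspond to about 62° C.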

Probes can be designed to a region in the 5′ end of a primer that does not bind a target region.

Probe hybridization to a target sequence is sufficient to effect separation of the fluorophore from the quencher. Improvement to the separation of the fluorophore from the quencher can be determined by the number of helical turns that exist between the two moieties upon probe binding. Further improvement of the separation of the fluorophore and the quencher can be obtained by using a sequence-dependent model that predicts an improved low Tm probe for any sequence. A set of sequences can be created with a fluorophore and quencher such that the nearest neighbor pairs of dinucleotides are equally represented. Their fractional annealing versus temperature with their complement can be monitored fluorometrically or using a real-time instrument. The measured change of fluorescence between bound and free states can then be related to the linear combination of dinucleotides to create a predictive model of DNA conformation and maximal delta fluorescence.

Alternatively, a user may include a cooling step during repeated cycling. For example, a repeated cycle may include a denaturation, annealing, extension, and a cooling step. In some embodiments, the cooling step of the repeated cycles comprises reducing the reaction temperature to below 45° C., below 40° C., below 35° C., below 30° C., or at or below 25° C. In some embodiments, the probe of the disclosure hybridizes to its target template nucleic acid during the cooling step. In such cases, a user may perform real-time detection of target amplicons.

Reaction Mixture for Sensitive Detection of Amplicons

In another aspect, the disclosure provides a reaction mixture for sensitive detection of amplicons. The reaction mixture for sensitive detection of amplicons can comprise components for carrying out a PCR reaction. The reaction mixture for sensitive detection of amplicons can comprise components necessary to amplify at least one amplicon from nucleic acid template molecules. The reaction mixture for sensitive detection of amplicons may comprise nucleotides (dNTPs), a polymerase, one or more primers, and a probe of the disclosure. The reaction mixture for sensitive detection of amplicons may further comprise a Tris buffer, a monovalent salt, and one or more cations. The one or more cations can be Mg²⁺ and/or Mn²⁺. In some embodiments, the reaction mixture for sensitive detection of amplicons comprises Mg²⁺ and Mn²⁺. The concentration of each component can be optimized by an ordinary skilled artisan. In some embodiments, the reaction mixture for sensitive detection of amplicons also comprises additives including, but not limited to, non-specific background/blocking nucleic acids (e.g., salmon sperm DNA), biopreservatives (e.g., sodium azide), PCR enhancers (e.g., Betaine, Trehalose, etc.), and inhibitors (e.g., RNAse inhibitors). In some embodiments, a nucleic acid sample is admixed with the reaction mixture for sensitive detection of amplicons. Accordingly, in some embodiments the reaction mixture for sensitive detection of amplicons further comprises a nucleic acid sample.

Primers used in the present disclosure can comprise a template binding region that is designed to hybridize to a target polynucleotide of interest. Primers used in the present disclosure are generally sufficiently long to prime the synthesis of extension products in the presence of the agent for polymerization. The exact length and composition of a primer can depend on many factors, including temperature of the annealing reaction, source and composition of the primer, and ratio of primer:probe concentration. The primer length can be, for example, about 5-100, 10-50, 15-30, or 18-22 nucleotides, although a primer may contain more or fewer nucleotides.

Primers used in the present disclosure can also comprise a probe-binding region. Exemplary probe-binding regions are described herein.

Primers used in the present disclosure can further comprise a barcode sequence. The term “barcode sequence” as used herein, generally refers to a unique sequence of nucleotides that can encode information about an assay. In some embodiments, a barcode sequence encodes information relating to the identity of an interrogated allele, identity of a target polynucleotide or genomic locus, identity of a sample, a subject, or any combination thereof. In some embodiments, a barcode sequence does not hybridize to the template nucleic acid. A barcode sequence can, for example, be designed to avoid significant sequence similarity or complementarity to known genomic sequences of an organism of interest. Such unique sequences can be randomly generated, e.g., by a computer readable medium, and selected by BLASTing against known nucleotide databases such as, e.g., EMBL, GenBank, or DDBJ. The barcode sequence can also be designed to avoid secondary structure. A barcode sequence may be at a 3′-end or more preferably at a 5′ end of a primer. Barcode sequences may vary widely in size and composition; the following references provide guidance for selecting sets of barcode sequences appropriate for particular embodiments: Brenner, U.S. Pat. No. 5,635,400; Brenner et al, Proc. Natl. Acad. Sci., 97: 1665-1670 (2000); Shoemaker et al, Nature Genetics, 14: 450-456 (1996); Morris et al, European patent publication 0799897A1; Wallace, U.S. Pat. No. 5,981,179, all of which are hereby incorporated by reference. In particular embodiments, a barcode sequence may have a length of about 4 to 36 nucleotides, about 6 to 30 nucleotides, or about 8 to 20 nucleotides. The barcode sequence can have any length. In some embodiments, primers can comprise a probe-binding region as described herein.
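As a non-limiting sketch of how candidate barcode sequences might be randomly generated and pre-screened in software, consider the following Python example. All names (`generate_barcodes`, `may_self_anneal`, `hamming`) and all thresholds (40-60% GC, no homopolymer run of four, a 4-nucleotide self-complementarity screen, a minimum pairwise Hamming distance of 3) are assumptions made for illustration. The self-complementarity screen is only a crude proxy for secondary-structure prediction, and screening against nucleotide databases (e.g., by BLAST) is not shown.

```python
import random

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def hamming(a: str, b: str) -> int:
    """Number of mismatched positions between two equal-length sequences."""
    return sum(x != y for x, y in zip(a, b))


def may_self_anneal(seq: str, stem: int = 4) -> bool:
    """Crude hairpin screen: does any stem-length window have its reverse
    complement further downstream in the same sequence?"""
    for i in range(len(seq) - stem + 1):
        if reverse_complement(seq[i:i + stem]) in seq[i + stem:]:
            return True
    return False


def generate_barcodes(n: int, length: int = 10, min_distance: int = 3,
                      seed: int = 0, max_attempts: int = 100_000) -> list:
    """Randomly draw candidates and keep those passing every screen."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(max_attempts):
        if len(accepted) == n:
            break
        cand = "".join(rng.choice("ACGT") for _ in range(length))
        gc_count = sum(base in "GC" for base in cand)
        if not 40 * length <= 100 * gc_count <= 60 * length:
            continue  # GC content outside the assumed 40-60% window
        if any(base * 4 in cand for base in "ACGT"):
            continue  # homopolymer run of four or more
        if may_self_anneal(cand):
            continue  # possible hairpin
        if all(hamming(cand, other) >= min_distance for other in accepted):
            accepted.append(cand)
    if len(accepted) < n:
        raise RuntimeError("could not find enough barcodes; relax the constraints")
    return accepted
```

Candidates that survive such a screen would still need to be compared against the genome of the organism of interest before use.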

Primers and/or probes may be prepared by any suitable method. Methods for preparing oligonucleotides of specific sequence are known in the art, and include, for example, cloning and restriction of appropriate sequences and direct chemical synthesis. Chemical synthesis methods may include, for example, the phosphotriester method described by Narang et al., 1979, Methods in Enzymology 68:90, the phosphodiester method disclosed by Brown et al., 1979, Methods in Enzymology 68:109, the diethylphosphoramidite method disclosed in Beaucage et al., 1981, Tetrahedron Letters 22:1859, and the solid support method disclosed in U.S. Pat. No. 4,458,066. The above references are hereby incorporated by reference.

Primers and/or probes can be obtained from commercial sources such as, e.g., Operon Technologies, Amersham Pharmacia Biotech, Sigma, IDT Technologies, and Life Technologies. The primers can have an identical or similar melting temperature. The lengths of the primers can be extended or shortened at the 5′ end or the 3′ end to produce primers with desired melting temperatures. Also, the annealing position of each primer pair and/or each probe can be designed such that the sequence and length of the primer pairs and/or probes yield the desired melting temperature.

The melting temperature of the primers and/or probes can be determined empirically, e.g., by performing a melting curve analysis. Methods of performing melting curve analysis to empirically determine Tm of a primer and/or probe are known to those of skill in the art. The melting temperature of the primers and/or probes can also be predicted. By way of example only, the simplest equation for predicting the melting temperature of primers smaller than 25 base pairs is the Wallace Rule:

(Td=2(A+T)+4(G+C)).
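The Wallace Rule can be evaluated directly from base counts. The following Python sketch is provided for illustration; the function name `wallace_td` is hypothetical, and the function assumes an unmodified DNA sequence containing only A, C, G, and T.

```python
def wallace_td(seq: str) -> int:
    """Estimate Td in degrees C as 2(A+T) + 4(G+C) for a short DNA oligonucleotide."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    if at + gc != len(seq):
        raise ValueError("sequence must contain only A, C, G, and T")
    return 2 * at + 4 * gc
```

For example, a 20-nucleotide primer containing ten A/T bases and ten G/C bases gives 2(10) + 4(10) = 60.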

Another method for calculating the Tm of an oligonucleotide is the nearest-neighbor method. The nearest-neighbor method generally incorporates certain variables such as salt concentration and DNA concentration. This method can incorporate reaction mixture conditions typically found in PCR applications, such as, e.g., 50 mM monovalent salt and 0.5 μM primer. Generally, the nearest-neighbor equation for DNA and RNA-based oligonucleotides is:

$$T_m = \frac{1000\,\Delta H}{A + \Delta S + R\,\ln(C/4)} - 273.15 + 16.6\,\log_{10}[\mathrm{Na}^+]$$

wherein ΔH (kcal/mol) is the sum of the nearest-neighbor enthalpy changes for hybrids, A is a constant containing corrections for helix initiation, ΔS is the sum of the nearest-neighbor entropy changes, R is the gas constant (1.99 cal K⁻¹ mol⁻¹), and C is the concentration of the oligonucleotide.

The ΔH and ΔS values for nearest-neighbor interactions of DNA and RNA are shown in Table 1 (below).

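TABLE 1. Thermodynamic parameters for nearest-neighbor melting point formula (ΔH in kcal/mol; ΔS in cal K⁻¹ mol⁻¹).

| Interaction | DNA ΔH | DNA ΔS | RNA ΔH | RNA ΔS |
|---|---|---|---|---|
| AA/TT | −9.1 | −24 | −6.6 | −18.4 |
| AT/TA | −8.6 | −23.9 | −5.7 | −15.5 |
| TA/AT | −6 | −16.9 | −8.1 | −22.6 |
| CA/GT | −5.8 | −12.9 | −10.5 | −27.8 |
| GT/CA | −6.5 | −17.3 | −10.2 | −26.2 |
| CT/GA | −7.8 | −20.8 | −7.6 | −19.2 |
| GA/CT | −5.6 | −13.5 | −13.3 | −35.5 |
| CG/GC | −11.9 | −27.8 | −8 | −19.4 |
| GC/CG | −11.1 | −26.7 | −14.2 | −34.9 |
| GG/CC | −11 | −26.6 | −12.2 | −29.7 |
| Initiation | 0 | −10.8 | 0 | −10.8 |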
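A minimal Python sketch of the nearest-neighbor calculation for a DNA oligonucleotide, using the DNA columns of Table 1, is given below for illustration. Several points are assumptions rather than statements of the disclosure: the constant A is taken to be the initiation entropy of Table 1 (−10.8), the denominator of the equation is read as A + ΔS + R ln(C/4), each "XY/ZW" entry of the table is applied both to the dinucleotide XY and to its reverse complement, and the default concentrations follow the example conditions of 0.5 μM primer and 50 mM monovalent salt. The function name `nearest_neighbor_tm` is hypothetical.

```python
import math

# DNA nearest-neighbor parameters from Table 1:
# (delta H in kcal/mol, delta S in cal/(K*mol)).
# Each table entry is assumed to cover a dinucleotide and its reverse complement.
DNA_NN = {
    "AA": (-9.1, -24.0), "TT": (-9.1, -24.0),
    "AT": (-8.6, -23.9),
    "TA": (-6.0, -16.9),
    "CA": (-5.8, -12.9), "TG": (-5.8, -12.9),
    "GT": (-6.5, -17.3), "AC": (-6.5, -17.3),
    "CT": (-7.8, -20.8), "AG": (-7.8, -20.8),
    "GA": (-5.6, -13.5), "TC": (-5.6, -13.5),
    "CG": (-11.9, -27.8),
    "GC": (-11.1, -26.7),
    "GG": (-11.0, -26.6), "CC": (-11.0, -26.6),
}
INITIATION_DS = -10.8  # assumed value of the constant A
GAS_CONSTANT = 1.99    # cal/(K*mol)


def nearest_neighbor_tm(seq: str, oligo_conc_molar: float = 0.5e-6,
                        na_molar: float = 0.05) -> float:
    """Predict Tm (degrees C) of a DNA oligonucleotide by the nearest-neighbor method."""
    seq = seq.upper()
    if len(seq) < 2:
        raise ValueError("sequence must contain at least two nucleotides")
    delta_h = 0.0
    delta_s = 0.0
    for i in range(len(seq) - 1):
        step = seq[i:i + 2]
        if step not in DNA_NN:
            raise ValueError(f"unsupported dinucleotide: {step}")
        step_h, step_s = DNA_NN[step]
        delta_h += step_h
        delta_s += step_s
    denominator = INITIATION_DS + delta_s + GAS_CONSTANT * math.log(oligo_conc_molar / 4)
    return 1000 * delta_h / denominator - 273.15 + 16.6 * math.log10(na_molar)
```

Under these assumptions, a GC-rich octamer such as GCGCGCGC is predicted to melt well above an AT-rich octamer such as ATATATAT, consistent with the larger enthalpy terms for G/C-containing steps in Table 1.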

Another equation that is generally used for predicting the Tm of a DNA oligonucleotide which is longer than, e.g., 50 bases at a pH between, e.g., 5.0 to 9.0 is the % GC method:

$$T_m = 81.5 + 16.6\,\log_{10}[\mathrm{Na}^+] + 41\,(X_G + X_C) - \frac{500}{L} - 0.62\,F$$

wherein [Na+] is the molar concentration of monovalent cations (in this case Na+), X_G and X_C are the mole fractions of G and C in the oligonucleotide, L is the length of the shortest strand in the duplex, and F is the percentage of formamide in the hybridization solution.
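The % GC method can likewise be sketched in a few lines of Python. The function name `percent_gc_tm` and the default values (50 mM monovalent cation, no formamide) are assumptions made for illustration, and the length of the supplied sequence is assumed to be the length L of the shortest strand in the duplex.

```python
import math


def percent_gc_tm(seq: str, na_molar: float = 0.05,
                  formamide_percent: float = 0.0) -> float:
    """Predict Tm (degrees C) of a long DNA duplex by the % GC method."""
    seq = seq.upper()
    length = len(seq)
    if length == 0:
        raise ValueError("sequence must not be empty")
    # X_G + X_C: combined mole fraction of G and C in the oligonucleotide.
    gc_fraction = (seq.count("G") + seq.count("C")) / length
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 41 * gc_fraction
            - 500 / length
            - 0.62 * formamide_percent)
```

For a 100-base sequence with 50% GC content in 50 mM Na+ and no formamide, the terms evaluate to approximately 81.5 − 21.6 + 20.5 − 5.0, or about 75.4° C.; each percentage point of formamide lowers the estimate by 0.62° C.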

Those of skill in the art will understand that Tm can also depend on factors other than the oligonucleotide sequence. Tm can depend on, e.g., salt concentration of a reaction mixture, buffer type used in a reaction mixture, the relative concentration of the primer or probe relative to the template concentration, and other factors. Computer programs can also be used to design primers, including but not limited to Array Designer Software (Arrayit Inc.), Oligonucleotide Probe Sequence Design Software for Genetic Analysis (Olympus Optical Co.), NetPrimer, PrimerExpress, and DNAsis from Hitachi Software Engineering. The Tm (melting or annealing temperature) of each primer can be calculated using software programs such as, e.g., Oligo Design, available from Invitrogen Corp, BioMath Calculators from Promega (www.promega.com/techserv/tools/biomath/calc11.htm), Tm Calculator from New England Biolabs, OligoAnalyzer from Integrated DNA Technologies, among others.

The reaction mixture for sensitive detection of amplicons can comprise reaction components for performing linear amplification. Generally, during linear amplification, only one strand of a double-stranded template nucleic acid is amplified per cycle, resulting in single-stranded extension products. To enable linear amplification, a reaction mixture can, for example, comprise only one primer per target polynucleotide.

Alternatively, the reaction mixture for sensitive detection of amplicons can be configured for exponential amplification. Generally, during exponential amplification, both strands of a double-stranded template nucleic acid are amplified per cycle, resulting in the generation of 2^n copies of a target polynucleotide, wherein n is the number of cycles in a PCR reaction. To enable exponential amplification, a reaction mixture can comprise a forward and reverse primer per target polynucleotide. Typically, for exponential amplification, the forward and reverse primers are present in the reaction mixture at a ratio between 1:3 and 3:1, between 1:2 and 2:1, preferably between 2:3 and 3:2, more preferably between 3:4 and 4:3, or even more preferably at about a 1:1 ratio.

In some cases, the reaction mixture for sensitive detection of amplicons can be configured for exponential amplification followed by linear amplification. In such cases, one primer of a forward/reverse primer set can be present in an excess concentration or amount as compared to the other primer of the forward/reverse primer set. In some embodiments, the concentration of the excess primer is at least 2×, 3×, 4×, 5×, 6×, 7×, 8×, 9×, or 10× the concentration of the limiting primer. In some embodiments, the concentration of the excess primer is about 2-10×, 5-50×, 20-100×, 50-500×, 100-1000×, 500-2000×, 1000-5000×, 2000-10000×, or more than 10000× the concentration of the limiting primer. In such cases, exponential amplification will proceed until exhaustion of the limiting primer, upon which linear amplification proceeds using the excess primer remaining in the reaction mixture or discrete reaction volume. Without wishing to be bound by theory, exponential-followed-by-linear amplification ensures (1) that enough amplification products are generated to result in a detectable signal, and (2) that the PCR reaction products are predominantly single-stranded extension products which, upon cooling the reaction temperature to below, e.g., 50° C., are available to bind to a detection probe instead of, e.g., to their reverse complement strand. Accordingly, in some embodiments, upon termination of PCR thermal cycling, single stranded extension products account for at least 5%, 10%, 20%, 30%, 40%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, or more than 95% of the total amount of reaction products. In some embodiments single stranded extension products do not account for at least 50% of the total amount of reaction products. In some embodiments, upon termination of PCR thermal cycling, at least 5%, at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, or more than 95% of the PCR extension products are extensions of the excess primer. In such cases, linear amplification can be performed following exponential amplification without a user adding or removing components from the reaction mixture.
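The transition from exponential to linear amplification upon exhaustion of the limiting primer can be illustrated with a toy model. The Python sketch below is a deliberately idealized assumption, not a description of actual reaction kinetics: every available template strand is extended in every cycle, each new strand consumes exactly one primer molecule, and quantities are counted as whole molecules. The function name `simulate_asymmetric_pcr` is hypothetical.

```python
def simulate_asymmetric_pcr(cycles: int, template_duplexes: int,
                            limiting_primer: int, excess_primer: int):
    """Return (excess_strands, limiting_strands) after the given number of cycles.

    excess_strands are strands extended from (or matching) the excess primer;
    limiting_strands are their complements, extended from the limiting primer.
    """
    excess_strands = template_duplexes
    limiting_strands = template_duplexes
    for _ in range(cycles):
        # Each strand type is copied from templates of the opposite type,
        # capped by how many molecules of its primer remain.
        new_excess = min(limiting_strands, excess_primer)
        new_limiting = min(excess_strands, limiting_primer)
        excess_strands += new_excess
        limiting_strands += new_limiting
        excess_primer -= new_excess
        limiting_primer -= new_limiting
    return excess_strands, limiting_strands
```

Starting from one duplex with 6 molecules of limiting primer and 1000 molecules of excess primer, this model doubles both strand types for two cycles, exhausts the limiting primer in the third cycle (leaving 7 limiting-primer strands), and thereafter adds 7 excess-primer strands per cycle, giving 29 excess-primer strands and 7 limiting-primer strands after six cycles.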

The reaction mixture for sensitive detection of amplicons can comprise a polymerase. In some embodiments, the polymerase is a DNA polymerase. In particular embodiments, the DNA polymerase is a thermostable polymerase. The thermostable polymerase may originate from a thermophilic bacterium or from Archaea. Exemplary thermostable polymerases include, but are not limited to, polymerase from Thermus aquaticus (Taq polymerase), polymerase from Pyrococcus furiosus (Pfu polymerase), Vent® DNA Polymerase from Thermococcus litoralis, Deep Vent™ polymerase from Pyrococcus sp., Platinum® Pfx polymerase, Tfi polymerase from Thermus filiformis, Pwo polymerase, chimeric DNA polymerases comprising a DNA binding protein (e.g., Phusion, iProof), topoisomerase. In some embodiments, the polymerase is capable of isothermal amplification. The polymerase can be, e.g., Bst DNA polymerase, Bca DNA polymerase, E. coli DNA polymerase I, the Klenow fragment of E. coli DNA polymerase I, Taq DNA polymerase, or T7 DNA polymerase (Sequenase).

In some embodiments, the DNA polymerase comprises 5′→3′ exonuclease activity. As used herein, “5′→3′ nuclease activity” or “5′ to 3′ nuclease activity” can refer to an activity of a template-specific nucleic acid polymerase whereby nucleotides are removed from the 5′ end of an oligonucleotide in a sequential manner. DNA polymerases with 5′→3′ exonuclease activity are known in the art and include, e.g., DNA polymerase isolated from Thermus aquaticus (Taq DNA polymerase). In some embodiments, the DNA polymerase lacks 3′→5′ exonuclease activity. Exemplary DNA polymerases lacking 3′→5′ exonuclease activity include, but are not limited to, BST DNA polymerase I, BST DNA polymerase I (large fragment), Taq polymerase, Streptococcus pneumoniae DNA polymerase I, Klenow Fragment (3′→5′ exo-), PyroPhage® 3173 DNA Polymerase, Exonuclease Minus (Exo-) (available from Lucigen), T4 DNA Polymerase, Exonuclease Minus (Lucigen). In some embodiments, the DNA polymerase is a recombinant DNA polymerase that has been engineered to lack exonuclease activity.

In some embodiments, a reaction mixture for sensitive detection of amplicons can comprise multiple primers and probes for multiplex detection. By way of example only, a reaction mixture for sensitive detection of amplicons can comprise a primer/probe set. In some embodiments, a primer/probe set comprises a common forward primer and optionally a reverse primer designed to amplify a target polynucleotide suspected of harboring a mutation at a locus, and further comprises a plurality of probes, wherein each probe is specific for a specific allele of the locus. Each probe in the primer/probe set can further comprise a detectable moiety that is detectably distinct from any other detectable moiety in the reaction mixture. By way of other example, a reaction mixture can comprise a plurality of primer/probe sets, wherein each primer/probe set is specific for a different target polynucleotide, e.g., a different locus. In some embodiments, one or both primers comprise a probe binding site, and the low Tm probe binds to the probe binding site on either the forward or reverse primer, or both.

In some embodiments, the primer/probe set comprises a common reverse primer, a first allele-specific forward primer, and at least a second allele-specific forward primer designed to amplify a target polynucleotide suspected of harboring a mutation at a locus. The forward primers can each comprise a template binding region. The template binding region may overlay a mutation. The forward primers can each further comprise a probe-binding region (e.g., barcode region). One of the forward primers can be a wild-type specific forward primer that is complementary to the wild-type allele at the site that overlays the mutation. The wild-type specific forward primer can further comprise a wild-type barcode region which does not generally hybridize to a template nucleic acid. The wild-type barcode region may contain a wild-type barcode sequence that specifically hybridizes to a wild-type low Tm probe, but does not substantially hybridize to a mutant low Tm probe. One of the forward primers can be a mutant-specific forward primer that is complementary to the mutant allele at the site that overlays the mutation. The mutant-specific forward primer can further comprise a mutant barcode region which does not generally hybridize to a template nucleic acid. The mutant barcode region may contain a mutant barcode sequence that specifically hybridizes to a mutant low Tm probe, but does not substantially hybridize to the wild-type low Tm probe. The forward primers (wild-type and mutant forward primers) may further comprise a deliberate mismatch nucleotide adjacent to or within 1-3 nucleotides from the nucleotide that overlays the mutation. However, in some cases, the forward primers do not further comprise a deliberate mismatch nucleotide adjacent to or within 1-3 nucleotides from the nucleotide that overlays the mutation. The primer/probe set may further comprise a wild-type low Tm probe and a mutant low Tm probe. The wild-type low Tm probe may be designed to specifically hybridize to the wild-type barcode region. The mutant low Tm probe may be designed to specifically hybridize to the mutant barcode region. The wild-type and mutant low Tm probes may comprise spectrally distinct fluorophores. The primer/probe set may further comprise a common reverse primer.

The reverse primer can be downstream of the forward primer. The reverse primer can be designed to bind to a target region 0, 1, 2, 3, 5, 10, 20, 30, or 50 bases away from the forward primer.

The reverse primer can be complementary to a pdo ligated to the 3′-end of the DNA.

Methods for Sensitive Detection of Amplicons

FIG. 16 depicts an exemplary workflow 1600 for a method for the sensitive detection of amplicons, comprising a first step 1610 of performing a probe-based PCR assay in a reaction mixture, wherein the probe-based PCR assay comprises thermal cycling, wherein the probe is designed to have minimal to zero impact on kinetics or efficiency of the PCR amplification reaction. In some embodiments, the probe does not hybridize to a template nucleic acid during the PCR reaction. In some embodiments, the oligonucleotide probe hybridizes to a template nucleic acid after termination of a PCR reaction. Termination of a PCR reaction can include a next step 1620 of allowing the reaction mixture to cool to a temperature that enables hybridization of the probe to a target polynucleotide. In some embodiments, probe hybridization enables detection of the hybridized probe. The method can further comprise a next step 1630 of detecting the probe.

Amplification

In some embodiments, amplification is carried out utilizing a nucleic acid polymerase. In some embodiments, the nucleic acid polymerase is a DNA polymerase. In particular embodiments, the DNA polymerase is a thermostable DNA polymerase. In other embodiments, the DNA polymerase is capable of isothermal amplification. Exemplary DNA polymerases are described herein.

In some embodiments, the reaction mixture is subjected to a PCR amplification reaction. PCR amplification can involve repeated thermal cycling. Thermal cycling can be carried out as an automated process. The automated process may be carried out using a PCR thermal cycler. Commercially available thermal cycler systems include systems from Bio-Rad Laboratories, Life Technologies, and Perkin-Elmer, among others.

The thermal cycling can comprise cycling through the repeated steps of denaturation, primer annealing, and primer extension. Temperatures and times for the three steps can be, e.g., 90-100° C. for 5 seconds or more for denaturation, 50-65° C. for 10-60 seconds for the annealing phase, and 50-75° C. for 15-120 seconds for primer extension. In some embodiments, primer annealing and primer extension are combined in a single temperature step (e.g., 60° C.). Prior to thermal cycling, a PCR reaction can include a “hot-start” initiation phase to activate a polymerase. The “hot-start” phase can comprise heating a reaction mixture to 90-100° C. Following the repeated cycles, a user may also include as part of the PCR reaction a final extension step. The final extension step can comprise a reaction temperature of 50-75° C. for, e.g., 5, 6, 7, 8, 9, 10, or more than 10 minutes.
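By way of illustration only, the exemplary temperature and time windows recited above can be written down as a small data structure and used to check a proposed cycle. The following is a minimal sketch; the names Step, WINDOWS, steps_outside_windows, and example_cycle are hypothetical and are not part of the disclosure, and the set-points in example_cycle are arbitrary values chosen from within the recited windows.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str        # "denaturation", "annealing", or "extension"
    temp_c: float    # set-point in degrees Celsius
    seconds: float   # hold time in seconds


# Exemplary windows recited in the text: (min C, max C, min s, max s).
# Denaturation has no recited upper time limit, hence infinity.
WINDOWS = {
    "denaturation": (90, 100, 5, float("inf")),
    "annealing": (50, 65, 10, 60),
    "extension": (50, 75, 15, 120),
}


def steps_outside_windows(steps):
    """Return the names of steps whose temperature or hold time is outside its window."""
    flagged = []
    for step in steps:
        t_lo, t_hi, s_lo, s_hi = WINDOWS[step.name]
        if not (t_lo <= step.temp_c <= t_hi and s_lo <= step.seconds <= s_hi):
            flagged.append(step.name)
    return flagged


# Hypothetical three-step cycle with set-points inside the recited windows.
example_cycle = [
    Step("denaturation", 95, 15),
    Step("annealing", 60, 30),
    Step("extension", 72, 30),
]
```

The windows are exemplary rather than limiting, so a flagged step indicates only that a set-point differs from the recited examples.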

Thermal cycling parameters can be set by a user. In some embodiments, a user sets thermal cycling parameters so as to enable endpoint detection of a low Tm probe. For example, a user can set thermal cycling parameters such that the repeated cycles do not include any temperature step below 50° C. Such parameters can minimize hybridization of the low Tm probe during the PCR reaction. Following the repeated cycles, a user may also include a final extension step. In some embodiments, the final extension step is not below 50° C. In particular embodiments, the final extension step is about 50-75° C. Following the repeated cycles, a user may include a final extension step and/or a cooling step wherein the reaction temperature is reduced to below 45° C., below 40° C., below 35° C., below 30° C., or at or below 25° C. In some embodiments, the low Tm probe hybridizes to its target template nucleic acid during the cooling step. In such cases, a user may perform endpoint detection of target amplicons. In some embodiments, the cooling step may comprise a controlled cooling step wherein a reaction temperature cools at a constant rate. The constant rate may be as described herein. In such cases, a user may note a temperature at which fluorescence is detected. In some cases, the temperature at which fluorescence is detected may provide information to a user as to a mutational status of a target nucleic acid.
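By way of illustration only, noting the temperature at which fluorescence is detected during a controlled cooling step can be sketched as a scan over paired temperature and fluorescence readings. The function name first_detection_temperature, the readings in ramp, and the detection threshold are hypothetical assumptions made for this sketch; the disclosure does not prescribe a particular threshold or data format.

```python
def first_detection_temperature(ramp, threshold):
    """Return the temperature of the first reading at or above the threshold.

    ramp: list of (temp_c, fluorescence) pairs in the order measured while cooling.
    Returns None if the signal never reaches the threshold.
    """
    for temp_c, signal in ramp:
        if signal >= threshold:
            return temp_c
    return None


# Hypothetical readings taken every 5 degrees C during a constant-rate cool-down.
ramp = [(60, 0.02), (55, 0.03), (50, 0.05), (45, 0.40), (40, 0.90), (35, 1.00)]
```

With these illustrative readings, a threshold of 0.25 is first met at the 45° C. reading.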

FIG. 17 depicts an exemplary workflow 1700 for an endpoint detection method of the disclosure, comprising a first step 1710 of conducting a PCR reaction in a plurality of reaction volumes. In some embodiments, one or more of the reaction volumes comprise a probe for sensitive detection of amplicons (e.g., a low Tm probe) comprising a fluorescent moiety and a quencher moiety. In some embodiments, the probe is configured to remain unhybridized during a PCR annealing or extension phase. In some embodiments, the PCR thermal cycling phases do not comprise any temperature phase that is less than 5° C. above the Tm of the low Tm probe. In some embodiments, the PCR reaction results in the generation of amplification products. In a next step 1720, the reaction volumes are cooled to a temperature that enables hybridization of the low Tm probe to the amplification products. In some embodiments, the selective hybridization of the low Tm probe to its target polynucleotide allows dequenching of fluorescence emission from the detectable moiety of the probe. In a next step 1730, the reaction volumes having detectable fluorescence are enumerated.
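By way of illustration only, two parts of the endpoint workflow above lend themselves to a short sketch: checking that no cycling phase falls within 5° C. of the probe Tm, and enumerating reaction volumes having detectable fluorescence. The function names, the fluorescence values, and the notion of a single fixed threshold are hypothetical assumptions made for this sketch.

```python
def cycling_keeps_margin(step_temps_c, probe_tm_c, margin_c=5.0):
    """True if every cycling phase is at least margin_c above the probe Tm."""
    return all(temp >= probe_tm_c + margin_c for temp in step_temps_c)


def count_positive_volumes(signals, threshold):
    """Count reaction volumes whose endpoint fluorescence is at or above the threshold."""
    return sum(1 for signal in signals if signal >= threshold)
```

For example, with an assumed probe Tm of 40° C., cycling phases at 95, 60, and 72° C. keep the margin, whereas a phase at 42° C. does not.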

Alternatively, a user may introduce a cooling step into the repeated thermal cycles. For example, a repeated cycle may include a denaturation step, annealing step, extension step, and a cooling step. In another example, a repeated cycle may include a first denaturation step, annealing step, extension step, second denaturation step, and a cooling step. In some embodiments, the cooling step of the repeated cycles comprises reducing the reaction temperature to below 45° C., below 40° C., below 35° C., below 30° C., or at or below 25° C. In some embodiments, the low Tm probe hybridizes to its target template nucleic acid during the cooling step. In such cases, a user may perform real-time PCR detection of target amplicons by detecting a level of hybridized probe during each cooling step. As used herein, “real-time PCR” refers to PCR methods wherein an amount of detectable signal is monitored with each cycle of PCR. In some embodiments, a cycle threshold (Ct) at which a detectable signal reaches a detectable level is determined. Generally, the lower the Ct value, the greater the concentration of the interrogated allele. Systems for real-time PCR are known in the art and include, e.g., the ABI 7700 and 7900HT Sequence Detection Systems (Applied Biosystems, Foster City, Calif.). The increase in signal during the exponential phase of PCR can provide a quantitative measurement of the amount of templates containing the mutant allele.
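By way of illustration only, determining a cycle threshold from per-cycle readings taken during each cooling step can be sketched as follows. The function name cycle_threshold, the two signal traces, and the fixed threshold are hypothetical; commercial real-time systems typically apply their own baseline correction and threshold logic, which this sketch does not attempt to reproduce.

```python
def cycle_threshold(signal_by_cycle, threshold):
    """Return the first cycle (1-indexed) whose signal is at or above the threshold.

    Returns None if the threshold is never reached.
    """
    for cycle, signal in enumerate(signal_by_cycle, start=1):
        if signal >= threshold:
            return cycle
    return None


# Hypothetical traces: the higher-input sample crosses the threshold earlier.
high_input = [0.01, 0.02, 0.04, 0.08, 0.16, 0.32, 0.64]
low_input = [0.01, 0.01, 0.01, 0.02, 0.04, 0.08, 0.16]
```

In these illustrative traces the higher-input sample reaches a threshold of 0.1 two cycles before the lower-input sample, consistent with a lower Ct indicating a greater starting concentration.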

FIG. 18 depicts an exemplary method of the disclosure comprising real-time detection, comprising thermal cycling a reaction mixture 1801 comprising template nucleic acid 1802, forward and reverse primers F1 and R1, respectively, a probe 1803 for sensitive detection of amplicons comprising a fluorescent moiety F and quencher moiety Q, dNTPs (not shown), and any other reaction components necessary for carrying out a PCR reaction (e.g., a polymerase, not shown). In some embodiments, the fluorescent moiety of the probe when the probe is in an unhybridized state is quenched (denoted by Fi). A PCR reaction may or may not be initiated by a “hot-start” (not shown). Thermal cycling may be initiated following the “hot-start”. The repeated thermal cycles can comprise a first denaturation phase 1810 which denatures the double-stranded template nucleic acid into single-stranded template strands 1811 and 1812. The first denaturation phase can be followed by a primer annealing phase 1820 in which the forward and reverse primers F1 and R1 are allowed to hybridize to their target strands 1811 and 1812. During the annealing phase, a probe 1803 for sensitive detection of amplicons generally does not exhibit significant hybridization to its target template. The annealing phase can be followed by an extension phase 1830, wherein a polymerase extends the F1 and R1 primers, thereby creating two copies of the target polynucleotide 1831 and 1832. During this phase, a probe 1803 for sensitive detection of amplicons would generally not hybridize to a template nucleic acid. The extension phase can be followed by a second denaturation phase 1840 which denatures the double-stranded template nucleic acid into single-stranded template strands 1841. The second denaturation phase can be followed by a cooling phase, e.g., cooling to below 50° C. or cooling to about room temperature. Cooling the reaction mixture can enable hybridization of the low Tm probe to a target polynucleotide. Hybridization of the probe can result in full extension of the probe and release the detectable moiety from the influence of the quencher moiety (detectable moiety depicted as *F). The detectable moiety can thus be detected during each thermal cycle. In some embodiments, the probe 1803 is a low Tm probe. In some cases, the low Tm probe has a melting point below 50° C. In some cases, the low Tm probe has a melting point of between about 35° C. and 45° C. In some embodiments, the probe 1803 is not a low Tm probe. In some cases, the probe 1803 has a melting point greater than 50° C.

In some embodiments, repeated cycles of denaturation, primer annealing, and primer extension result in the accumulation of amplicons comprising a target polynucleotide. The amplicons may be single or double stranded. Sufficient cycles can be run to accumulate an amount of amplicons comprising the target polynucleotide sufficient to enable hybridization of detectable levels of probe. The resulting detectable signal can be 2, 5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, or 10000-fold greater or several orders of magnitude greater than background signal.

In some embodiments, the PCR amplification reaction is an exponential amplification reaction. An exemplary embodiment of a method involving exponential amplification is depicted in FIG. 19. A starting reaction mixture or volume 1901 can comprise a template nucleic acid 1902, which may be a double-stranded template nucleic acid, a probe 1903 for sensitive detection of amplicons as described herein, the probe 1903 comprising a fluorescent moiety and quencher moiety, forward and reverse primers F1 and R1 designed to amplify a target polynucleotide, dNTPs (not shown), and any other reaction components necessary for carrying out a PCR reaction (e.g., a polymerase, not shown). In some embodiments, the fluorescent moiety of the probe when the probe is in an unhybridized state is quenched (denoted by Fi). A PCR reaction may or may not be initiated by a “hot-start” (not shown). The reaction mixture may then begin thermal cycling. Each thermal cycle can comprise a denaturation phase 1910, in which a double-stranded template nucleic acid is partially or fully denatured into single strands 1911 and 1912. Generally, during this denaturation phase, neither primer hybridization nor probe hybridization occurs. After denaturation, an annealing phase 1920 can be initiated wherein the F1 and R1 primers anneal to the single strands of the target polynucleotide. During this phase, a probe 1903 for sensitive detection of amplicons would generally not hybridize to a template nucleic acid. After the annealing phase, an extension phase 1930 can be initiated wherein a polymerase extends the F1 and R1 primers, thereby creating two copies of the target polynucleotide 1931 and 1932. During this phase, a probe 1903 for sensitive detection of amplicons would generally not hybridize to a template nucleic acid. Repetition of the thermal cycles can accordingly result in the exponential amplification of the target polynucleotide. After the final repeated cycle, a final denaturation step 1940 can be initiated. The final denaturation step can fully or partially denature any double-stranded target polynucleotides into single strands 1941. Following the final denaturation step, the reaction mixture can be cooled in a cooling step 1950, e.g., cooled to below 50° C. or cooled to about room temperature. Cooling the reaction mixture can enable hybridization of the probe 1903 to a target polynucleotide in a final cooled phase 1960. Hybridization of the probe can result in full extension of the probe and release the detectable moiety from the influence of the quencher moiety. The detectable moiety can thus be detected. In some embodiments, the probe 1903 is a low Tm probe. In some cases, the low Tm probe has a melting point below 50° C. In some cases, the low Tm probe has a melting point of between about 35° C. and 45° C. In some embodiments, the probe 1903 does not have a Tm. In some embodiments, the probe 1903 is not a low Tm probe. In some cases, the probe 1903 has a melting point greater than 50° C.

In some embodiments, the PCR amplification reaction is a linear amplification reaction. An exemplary embodiment of a method comprising linear amplification is depicted in FIG. 20. A starting reaction mixture or volume 2001 can comprise a template nucleic acid 2002, which may be a double-stranded template nucleic acid, a probe 2003 for sensitive detection of amplicons as described herein, the probe 2003 comprising a fluorescent moiety and quencher moiety, and a primer F1 designed to hybridize to a single template strand comprising a target polynucleotide in a strand-specific manner, dNTPs (not shown), and any other reaction components necessary for carrying out a PCR reaction (e.g., a polymerase, not shown). In some embodiments, the fluorescent moiety of the probe when the probe is in an unhybridized state is quenched (denoted by Fi). A PCR reaction may or may not be initiated by a “hot-start” (not shown). The reaction mixture may then begin thermal cycling. Each thermal cycle can comprise a denaturation phase 2010, in which a double-stranded template nucleic acid is partially or fully denatured into single strands 2011 and 2012. Generally, during this denaturation phase, neither primer hybridization nor probe hybridization occurs. After denaturation, an annealing phase 2020 can be initiated wherein the F1 primer anneals to a denatured strand 2012 of the target polynucleotide in a strand-specific manner. During this phase, a probe 2003 for sensitive detection of amplicons would generally not hybridize to a template nucleic acid. After the annealing phase, an extension phase 2030 can be initiated wherein a polymerase extends the F1 primer, thereby creating a copy of the target polynucleotide 2031. During this phase, a probe 2003 for sensitive detection of amplicons would generally not hybridize to a template nucleic acid. During this phase, the single strand 2011 is generally not amplified. Repetition of the thermal cycles of denaturation, annealing, and extension can accordingly result in the linear accumulation of single-stranded amplicons 2041 comprising the target polynucleotide. Upon termination of thermal cycling, which can result in the accumulation of single-stranded products 2041, the reaction mixture can be cooled in a cooling step 2040, e.g., cooled to below 50° C. or cooled to about room temperature. Cooling the reaction mixture can enable hybridization of the probe 2003 to a target polynucleotide in a final cooled phase 2050. Hybridization of the probe can result in full extension of the probe and release the detectable moiety from the influence of the quencher moiety. The detectable moiety can thus be detected. In some embodiments, the probe 2003 is a low Tm probe. In some cases, the low Tm probe has a melting point below 50° C. In some cases, the low Tm probe has a melting point of between about 35° C. and 45° C. In some embodiments, the probe 2003 does not have a Tm. In some embodiments, the probe 2003 is not a low Tm probe. In some cases, the probe 2003 has a melting point greater than 50° C.
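By way of illustration only, the difference between exponential and linear accumulation can be expressed with two idealized formulas. The sketch below assumes a constant per-cycle efficiency and unlimited reagents, which real reactions only approximate; the function names and the efficiency parameter are hypothetical and are not recited in the disclosure.

```python
def exponential_copies(start_copies, cycles, efficiency=1.0):
    """Idealized two-primer PCR: total copies grow by a factor of (1 + efficiency) per cycle.

    The returned value includes the starting copies.
    """
    return start_copies * (1.0 + efficiency) ** cycles


def linear_copies(start_copies, cycles, efficiency=1.0):
    """Idealized single-primer PCR: a fixed template pool yields new strands each cycle.

    The returned value counts only the newly made single-stranded copies.
    """
    return start_copies * efficiency * cycles
```

Under these assumptions, 100 starting copies yield 102,400 copies after 10 cycles of exponential amplification, but only 1,000 new single-stranded copies after 10 cycles of linear amplification.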

In some embodiments, the PCR amplification reaction is a non-symmetric polymerase chain reaction (PCR). The non-symmetric PCR reaction can include an initial exponential amplification phase followed by a linear amplification phase. In some cases, the transition from an exponential to a linear amplification phase occurs without addition of reaction components to a reaction mixture or removal of components from the reaction mixture. In some cases, the non-symmetric PCR reaction involves subjecting a reaction mixture to repeated thermal cycles, wherein the reaction mixture comprises a polynucleotide template target, a pair of PCR primers, dNTPs, a probe of the disclosure, and a thermostable polymerase. The thermal cycles can correspond to the PCR steps of denaturation, primer annealing, and primer extension, wherein, at the outset of the PCR reaction, the PCR primer pair comprises a limiting primer and an excess primer. The excess primer can be present at a concentration at least two times higher, at least three times higher, at least four times higher, at least five times higher, at least 10 times higher, at least 20 times higher, at least 30 times higher, at least 40 times higher, at least 50 times higher, at least 100 times higher, at least 200 times higher, at least 300 times higher, at least 400 times higher, at least 500 times higher, or at least 1000 times higher than the limiting primer. The excess primer can be present at a concentration that is 2-8× higher, 5-10× higher, 10-100× higher, or 100-500× higher than the concentration of the limiting primer.

For example, the starting molar concentration of the limiting primer can be less than the starting molar concentration of the excess primer. The ratio of the starting concentrations of the excess primer relative to the limiting primer can be at least 2:1, 3:1, 4:1, 5:1, 10:1, 20:1, or 100:1. The ratio of excess primer to limiting primer can be 5:1, 10:1, 15:1, 20:1, 25:1, 30:1, 35:1, 40:1, 45:1, 50:1, 55:1, 60:1, 65:1, 70:1, 75:1, 80:1, 85:1, 90:1, 95:1, or 100:1. In some embodiments, the ratio is in the range of 20:1 to 100:1.
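By way of illustration only, the ratio of excess primer to limiting primer can be computed from starting concentrations and compared with the 20:1 to 100:1 range recited above. The function names and the nanomolar units are assumptions made for this sketch; any consistent concentration unit would serve.

```python
def excess_to_limiting_ratio(excess_nm, limiting_nm):
    """Return the excess:limiting primer ratio from starting concentrations (nM)."""
    if limiting_nm <= 0:
        raise ValueError("limiting primer concentration must be positive")
    return excess_nm / limiting_nm


def in_ratio_range(ratio, low=20.0, high=100.0):
    """True if the ratio falls within the inclusive range low:1 to high:1."""
    return low <= ratio <= high
```

For example, 1000 nM excess primer with 50 nM limiting primer gives a 20:1 ratio, which lies at the lower bound of the recited range.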

An exemplary embodiment of a method comprising exponential amplification followed by linear amplification is depicted in FIG. 21. A starting reaction mixture or volume 2101 can comprise a template nucleic acid 2102, which may be a double-stranded template nucleic acid, a probe 2103 for sensitive detection of amplicons as described herein, the probe 2103 comprising a fluorescent moiety and quencher moiety, an excess primer 2104, and a limiting primer 2105, designed to hybridize to opposite strands of a target polynucleotide, dNTPs (not shown), and any other reaction components necessary for carrying out a PCR reaction (e.g., a polymerase, not shown). In some embodiments, the fluorescent moiety of the probe when the probe is in an unhybridized state is quenched (denoted by Fi). A PCR reaction may or may not be initiated by a “hot-start” (not shown). The reaction mixture may then begin thermal cycling. Each thermal cycle can comprise a denaturation phase 2110, in which a double-stranded template nucleic acid is partially or fully denatured into single strands 2111 and 2112. Generally, during this denaturation phase, neither primer hybridization nor probe hybridization occurs. After denaturation, an annealing phase 2120 can be initiated wherein primers 2104 and 2105 anneal to the single strands of the target polynucleotide. During this phase, a probe 2103 for sensitive detection of amplicons would generally not hybridize to a template nucleic acid. After the annealing phase, an extension phase 2130 can be initiated wherein a polymerase extends primers 2104 and 2105, thereby creating two copies of the target polynucleotide 2131 and 2132. During this phase, a probe 2103 for sensitive detection of amplicons would generally not hybridize to a template nucleic acid. Repetition of the thermal cycles can accordingly result in the exponential amplification of the target polynucleotide until the limiting primer 2105 is exhausted, after which the thermal cycles result in linear amplification of the target polynucleotide. The thermal cycles of linear amplification can comprise the same repeated cycles of denaturation, annealing, and extension as described above. In a denaturation phase 2140, the amplified, double-stranded target polynucleotides 2131 and 2132 are denatured into single strands 2141 and 2142. Generally, during this denaturation phase, neither primer hybridization nor probe hybridization occurs. Following denaturation, an annealing phase 2150 can be initiated wherein excess primer 2104 anneals to single strands 2142. During this phase, a probe 2103 for sensitive detection of amplicons would generally not hybridize to a template nucleic acid. After the annealing phase, an extension phase 2160 can be initiated wherein a polymerase extends primer 2104, thereby creating a copy of the target polynucleotide 2161. During this phase, the probe 2103 would generally not hybridize to a template nucleic acid. During this phase, the single strand 2141 is generally not amplified. Repetition of the thermal cycles can accordingly result in the linear amplification of the target polynucleotide and accumulation of single-stranded products 2171. Upon termination of thermal cycling, which results in the accumulation of single-stranded products 2171, the reaction mixture can be cooled in a cooling step 2180, e.g., cooled to below 50° C. or cooled to about room temperature. Cooling the reaction mixture can enable hybridization of the probe 2103 to a target polynucleotide in a final cooled phase 2190. Hybridization of the probe can result in full extension of the probe and release the detectable moiety from the influence of the quencher moiety. The detectable moiety can thus be detected (denoted as F*). In some embodiments, the probe 2103 is a low Tm probe. In some cases, the low Tm probe has a melting point below 50° C. In some cases, the low Tm probe has a melting point of between about 35° C. and 45° C. In some embodiments, the probe 2103 does not have a Tm. In some embodiments, the probe 2103 is not a low Tm probe. In some cases, the probe 2103 has a melting point greater than 50° C.
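By way of illustration only, the transition from exponential to linear amplification upon exhaustion of the limiting primer can be sketched with a simple counting model. The model assumes that every template strand with an available primer is copied in each cycle and that each extension consumes one primer molecule; these idealizations, the function name simulate_asymmetric_pcr, and the variable names are hypothetical and are not recited in the disclosure.

```python
def simulate_asymmetric_pcr(start_duplexes, limiting_primers, excess_primers, cycles):
    """Idealized asymmetric PCR; returns per-cycle (x_strands, l_strands) counts.

    x_strands: strands of the sense produced by extending the excess primer.
    l_strands: strands of the sense produced by extending the limiting primer.
    Each starting duplex contributes one strand of each sense.
    """
    x_strands = start_duplexes
    l_strands = start_duplexes
    history = []
    for _ in range(cycles):
        # The excess primer anneals to l_strands; the limiting primer to x_strands.
        new_x = min(l_strands, excess_primers)
        new_l = min(x_strands, limiting_primers)
        excess_primers -= new_x
        limiting_primers -= new_l
        x_strands += new_x
        l_strands += new_l
        history.append((x_strands, l_strands))
    return history
```

Starting from one duplex with seven limiting primer molecules, both strands double for three cycles. Once the limiting primer is gone, only the excess-primer product grows, by a fixed eight strands per cycle, which corresponds to the single-stranded products that accumulate for probe hybridization.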

The methods described herein can be used for allelic discrimination assays. FIGS. 22A-22B depict exemplary embodiments of a method for allelic discrimination. In FIG. 22A, a reaction mixture or reaction volume can comprise a template nucleic acid, a forward primer and optionally a reverse primer designed to amplify a region comprising a locus. The locus can be suspected of harboring a mutation. The reaction mixture can further comprise a probe for sensitive detection of amplicons that, when free in solution, generally does not emit a detectable signal. The probe can be an allele-specific probe that is designed to be perfectly matched to a target harboring a particular allele of a locus. In step 2210, PCR amplification can result in the generation of a plurality of amplicons comprising the perfectly matched target. In some cases, the amplicons comprise single-stranded amplicons. In some cases, the amplicons can be double-stranded amplicons. In such cases, following PCR amplification the double-stranded amplicons can be denatured, e.g., by heating the reaction mixture to 90-100° C. (not shown). In some cases, PCR amplification cycling parameters are configured so as to minimize hybridization of the probe to the perfectly matched template during the PCR reaction. In a next step 2220, the reaction mixture is cooled so as to allow hybridization of the probe to the perfectly matched target. In some cases, the hybridization of the probe increases the distance between the detectable moiety and the quencher, enabling detection of the detectable moiety. In FIG. 22B, the target harbors a different allele of the locus. Accordingly, the target is mismatched to the probe. In step 2210, PCR amplification can result in the generation of a plurality of amplicons comprising the mismatched target. In some cases, the amplicons comprise single-stranded amplicons. In some cases, the amplicons can be double-stranded amplicons. In such cases, following PCR amplification the double-stranded amplicons can be denatured, e.g., by heating the reaction mixture to 90-100° C. (not shown). In some cases, PCR amplification cycling parameters are configured so as to minimize hybridization of the probe to the template during the PCR reaction. In a next step 2220, the reaction mixture is cooled so as to allow hybridization of the probe to the target. However, due to the probe/template mismatch, the hybridization of the probe to the target can be reduced and/or minimized. In such cases, the probe can remain largely free in solution and therefore remain quenched. In some embodiments, a reaction mixture can comprise a plurality of probes. In particular embodiments, each probe of the plurality of probes is specific for a specific allele of a locus. In some embodiments, each probe of the plurality of probes comprises a detectable moiety that is detectably distinct from other moieties of the probes. In some embodiments, the probe is a low Tm probe. In some cases, the low Tm probe has a melting point below 50° C. In some cases, the low Tm probe has a melting point of between about 35° C. and 45° C. In some embodiments, the probe comprises a fluorophore. In some embodiments, the probe does not have a Tm. In some embodiments, the probe is not a low Tm probe. In some cases, the probe has a melting point greater than 50° C.
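By way of illustration only, when a reaction mixture comprises a plurality of allele-specific probes with detectably distinct moieties, the alleles present can be read out from which probe channels exceed background after cooling. The function name call_alleles, the channel labels, the signal values, and the single shared threshold are hypothetical assumptions made for this sketch.

```python
def call_alleles(signal_by_allele, threshold):
    """Return a sorted list of alleles whose probe signal is at or above the threshold.

    signal_by_allele: mapping of allele label to endpoint fluorescence in that
    probe's channel. A matched probe hybridizes and is dequenched; a mismatched
    probe stays largely free in solution and quenched.
    """
    return sorted(
        allele for allele, signal in signal_by_allele.items() if signal >= threshold
    )
```

For example, a high signal in the wild-type channel with background in the mutant channel yields a wild-type call, while signal in both channels indicates that both alleles were amplified.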

FIG. 23 depicts another exemplary embodiment of a digital PCR method for allele detection, which utilizes low Tm probes as described herein for sensitive detection of amplicons in combination with oligonucleotide primers as described herein which comprise (1) a template binding region and (2) a probe binding region. In FIG. 23, a reaction mixture or reaction volume can comprise a template nucleic acid 2302 which comprises either a wild-type allele 2307 or mutant allele 2308. The reaction mixture can further comprise a plurality of allele-specific forward primers. The allele-specific forward primers can include a first allele-specific forward primer Fwd1 (e.g., specific for a wild-type allele), and at least a second allele-specific forward primer Fwd2 (e.g., specific for a mutant allele), each designed to amplify a target polynucleotide 2302 suspected of harboring a mutation at a locus. Fwd1 can comprise a wild-type barcode region 2305 which does not generally hybridize to template nucleic acid 2302. The wild-type barcode region 2305 may contain a wild-type barcode sequence that specifically hybridizes to a wild-type low Tm probe, but does not substantially hybridize to a mutant low Tm probe. Fwd1 can further comprise a template binding region 2306 which is designed to hybridize to the target polynucleotide 2302, and which contains a nucleotide at or near (e.g., within 1-3 nucleotides of) a 3′ end which is complementary to a wild-type allele 2307. One of the forward primers can be a mutant-specific forward primer that is complementary to the mutant allele at the site that overlays the mutation. Fwd2 can comprise a mutant barcode region 2310 which does not generally hybridize to a template nucleic acid. The mutant barcode region may contain a mutant barcode sequence that specifically hybridizes to a mutant low Tm probe, but does not substantially hybridize to the wild-type low Tm probe. Fwd2 can further comprise a template binding region 2311 which is designed to hybridize to the target polynucleotide 2302, and which contains a nucleotide at or near (e.g., within 1-3 nucleotides of) a 3′ end which is complementary to a mutant allele 2308. The forward primers Fwd1 and Fwd2 may each further comprise a deliberate mismatch nucleotide adjacent to or within 1-3 nucleotides from the nucleotide that overlays the mutation. However, in some cases, the forward primers do not further comprise a deliberate mismatch nucleotide adjacent to or within 1-3 nucleotides from the nucleotide that overlays the mutation. The reaction mixture may further comprise a wild-type low Tm probe 2303 and a mutant low Tm probe 2309. The wild-type low Tm probe 2303 may be designed to specifically hybridize to the reverse complement of the wild-type barcode region 2305. The mutant low Tm probe 2309 may be designed to specifically hybridize to the reverse complement of the mutant barcode region 2310. The wild-type and mutant low Tm probes 2303 and 2309 may comprise spectrally distinct fluorophores F1 and F2. The reaction mixture may further comprise a reverse primer (“Rev”). The reverse primer may be present in an excess amount as compared to the amount of forward primers, which are present in limited amounts. The reaction mixture may further comprise a stable DNA polymerase “Pol”, and dNTPs and other components for carrying out an amplification reaction. In a first step, template DNA molecules are contacted with the reaction mixture described above. Forward primers Fwd1 and Fwd2 may hybridize to template DNA containing either the wild-type allele 2307 and/or mutant allele 2308. Accordingly, there is a mismatch between the 3′ terminal base of 2306 and mutant allele 2308, and a mismatch between the 3′ terminal base of 2311 and wild-type allele 2307. In a next step, the DNA polymerase “Pol” can promote efficient extension of the Fwd1 primer annealed to template DNA containing the 2307 wild-type allele, but does not promote efficient extension of the Fwd1 primer annealed to template DNA containing the 2308 mutant allele (due to a greater mismatch between Fwd1 and 2308). By the same token, polymerase “Pol” can promote efficient extension of the Fwd2 primer annealed to template DNA containing the 2308 mutant allele but does not promote efficient extension of the Fwd2 primer annealed to template DNA containing the 2307 wild-type allele (due to a greater mismatch between Fwd2 and 2307). Efficient extension results in extension products comprising the reverse complement of the wild-type barcode 2305 or the reverse complement of the mutant barcode 2310. In a second (and any subsequent) round of amplification, the excess Rev primer can anneal to the extension products comprising either 2305 or 2310 and (after exhaustion of limiting primers Fwd1 and Fwd2) promote linear amplification of the extension products comprising either barcode 2305 or 2310. During the amplification cycles, the wild-type and mutant low Tm probes 2303 and 2309 do not hybridize to the barcodes 2305 and/or 2310. After amplification cycles are completed, the reaction mixture can be cooled, e.g., to about 25° C., thereby allowing the probes 2303 and 2309 to hybridize to their respective barcodes 2305 and 2310. Hybridization of the probes to their respective barcode regions releases the fluorophores F1 and F2 from their quenchers (Q) and promotes fluorescence of the fluorophores.
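By way of illustration only, the layout of the allele-specific forward primers described above (a 5′ barcode region followed by a template binding region whose 3′ terminal base matches one allele) can be sketched with string operations. All sequences below are hypothetical placeholders and are not taken from the Sequence Listing; the function names are likewise assumptions, and the optional deliberate mismatch nucleotide is omitted for simplicity.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def allele_specific_primer(barcode, upstream, allele_base):
    """Build 5'-barcode + template binding region ending in the allele-specific base."""
    return barcode + upstream + allele_base


# Hypothetical sequences for illustration only.
UPSTREAM = "GATTACAGGCTTACCAGT"   # template sequence immediately 5' of the variant site
WT_BARCODE = "ACCTGATCGGTA"       # wild-type barcode region (does not match the template)
MUT_BARCODE = "CAGTCTTGAGCA"      # mutant barcode region (does not match the template)

fwd1 = allele_specific_primer(WT_BARCODE, UPSTREAM, "G")   # wild-type specific
fwd2 = allele_specific_primer(MUT_BARCODE, UPSTREAM, "A")  # mutant specific

# Copying a Fwd1 extension product from the reverse primer yields a strand that
# carries the reverse complement of the barcode, which is the probe's target.
wt_probe_target = reverse_complement(WT_BARCODE)
```

In this sketch the two primers share the same template binding sequence and differ only in the barcode and the 3′ terminal base, so a probe having the barcode's own sequence would be complementary to the strand synthesized from the reverse primer.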

Applications of Sensitive Detection of Amplicons

The methods and kits of the present disclosure may be used for the sensitive and accurate analysis of nucleic acids isolated from a subject. Such detection and analysis can be useful for a wide range of applications, including but not limited to diagnostic and/or therapeutic purposes. By way of example only, the detection methods may be used for the detection of a mutation in a subject, for diagnosing a disease in a subject, for monitoring disease progression in a subject, for aiding in the selection of a therapeutic regimen for a disease in a subject, for determining the effectiveness of a therapy targeting a disease in a subject, or for evaluating disease prognosis in a subject. Exemplary subjects are described herein. In some embodiments, nucleic acid from a biological sample isolated from the subject is analyzed using the methods and/or kits described herein for sensitive detection of amplicons.

Exemplary biological samples are described herein. In particular embodiments, the sample is a tumor sample. In some embodiments, the tumor sample is processed prior to the probe-based assay. Processing can comprise fixation in a formalin solution, followed by embedding in paraffin (e.g., the sample is an FFPE sample). Processing can alternatively comprise freezing of the sample prior to conducting the probe-based assay. In some embodiments, the sample is neither fixed nor frozen. The unfixed, unfrozen sample can be, by way of example only, stored in a storage solution configured for the preservation of nucleic acid.

In some embodiments, non-nucleic acid materials can be removed from the starting material using enzymatic treatments (for example, with a protease). The sample can optionally be subjected to homogenization, sonication, French press, dounce, or freeze/thaw, which can be followed by centrifugation. The centrifugation may separate nucleic acid-containing fractions from non-nucleic acid-containing fractions.

Nucleic acid can be isolated from the biological sample using any means known in the art. For example, nucleic acid can be extracted from the biological sample using liquid extraction (e.g., Trizol, DNAzol) techniques. Nucleic acid can also be extracted using commercially available kits (e.g., Qiagen DNeasy kit, QIAamp kit, Qiagen Midi kit, QIAprep spin kit).

Nucleic acid can be fragmented in situ or de novo through physical, chemical, or enzymatic means to a uniform distribution.

Nucleic acid can be concentrated by known methods, including, by way of example only, centrifugation. Nucleic acid can be bound to a selective membrane (e.g., silica) for the purposes of purification. Nucleic acid can also be enriched for fragments of a desired length, e.g., fragments which are less than 1000, 500, 400, 300, 200 or 100 base pairs in length. Such an enrichment based on size can be performed using, e.g., PEG precipitation, an electrophoretic gel or chromatography material (Huber et al. (1993) Nucleic Acids Res. 21:1061-6), gel filtration chromatography, or TSK gel (Kato et al. (1984) J. Biochem, 95:83-86), which publications are hereby incorporated by reference.

Polynucleotides extracted from a biological sample can be selectively precipitated or concentrated using any methods known in the art.

The probes, reaction mixtures, kits, methods, and systems describedherein for sensitive detection of amplicons can be utilized in theassessment of a disease in a subject. In some embodiments, the diseaseis a cancer. The method can comprise determining the presence, absence,or level of a mutation in any number of genes of interest. For example,the method can comprise determining the presence, absence or level of amutation in 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17,18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35,36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53,54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71,72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89,90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 110, 120, 129, 130, 140,150, 160, 170, 180, 190, 200, or more than 200 genes of interest. Themethod can comprise determining the presence, absence or level of amutation in 1-3, 2-5, 4-10, 5-20, 10-50, 30-100, 50-150, 70-200, or morethan 200 genes of interest. Genes of interest can include anycancer-related genes known in the art. Cancer-related genes aredescribed herein. In some embodiments, the genes of interest aresuspected of harboring a SNP, insertion, deletion, or translocation. Insome embodiments, the genes of interest are suspected of harboring acopy number variation.

The method can involve determining the presence, absence, or level of a copy number variation in a subset of genes. The method can involve determining a copy number variation in 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 25, 30, 35, 40, 45, 50, or more than 50 genes, e.g., cancer-related genes, relative to a set of reference genes. In some cases, the method involves determining a copy number variation of one or more genes (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, or 19 genes). The genes can be selected from the group consisting of MET, FGFR1, FGFR2, FLT3, HER3, EGFR, mTOR, CDK4, HER2, RET, DDR2, AURKA, VEGFA, CDK6, JAK2, BRAF, and SRC. In some cases, the method involves determining a copy number variation of one or more genes (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, or 13 genes) selected from the group consisting of EGFR, AURKA, VEGFA, FGFR1, CDK4, ERBB2, CDK6, JAK2, MET, BRAF, ERBB3, and SRC. The reference genes can be, e.g., HADH, ZFP3, or RNaseP. The method of assessing cancer can comprise conducting a probe-based assay for sensitive detection of amplicons as described herein, using a probe for sensitive detection of amplicons as described herein.

One or more methods of the disclosure can be used for copy number variation analysis. The methods for copy number variation can comprise two assays. The two assays can be a target assay and a reference assay. Each assay can comprise a single primer set or multiple primer sets, wherein each primer set shares a common probe. The target assay can utilize a primer/probe set that is specific for a target region that is suspected of harboring a copy number variation. The target assay can utilize a single probe with multiple primer sets that are specific for multiple regions that are suspected of harboring a copy number variation. The reference assay can utilize a primer/probe set that is specific for a reference region that is known or suspected to not harbor a copy number variation. The reference assay can utilize a single probe with multiple primer sets that are specific for multiple regions that are known or suspected not to harbor a copy number variation. The target and reference regions may be on the same or on different chromosomes. The target region can be a region in any chromosome, for example, a region in human chromosome 13, 18, 21, X, or Y. Copy number variation of the target region can be estimated by any means known in the art, for example, by a ratio between the estimated target vs. reference concentration, or by a statistical analysis of the difference in concentration of the target vs. the reference region.
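The ratio-based estimate mentioned above can be illustrated with a minimal Python sketch. It assumes a digital PCR readout in which each partition is scored positive or negative, applies the standard Poisson correction to convert the negative fraction into a mean number of copies per partition, and assumes the reference region is present at two copies per genome. The function names and default value are hypothetical and are not taken from the disclosure.

```python
import math


def copies_per_partition(positive: int, total: int) -> float:
    """Poisson-corrected mean number of template copies per partition."""
    if total <= 0 or not 0 <= positive < total:
        raise ValueError("need total > 0 and 0 <= positive < total")
    negative_fraction = (total - positive) / total
    return -math.log(negative_fraction)


def estimate_copy_number(target_pos: int, target_total: int,
                         ref_pos: int, ref_total: int,
                         ref_copies: float = 2.0) -> float:
    """Target copy number from the target/reference concentration ratio.

    ref_copies is an assumption: the reference region is taken to be
    present at two copies per genome.
    """
    target = copies_per_partition(target_pos, target_total)
    reference = copies_per_partition(ref_pos, ref_total)
    if reference == 0:
        raise ValueError("reference assay has no positive partitions")
    return ref_copies * target / reference
```

With equal positive counts in both assays the sketch returns the assumed reference copy number; a target concentration twice that of the reference returns twice that value.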

In some embodiments, a method for assessing cancer comprises copy numbervariation analysis of 12 genes selected from the group consisting ofVEGFA, EGFR, CDK6, MET, BRAF, FGFR1, JAK2, HER3, CDK4, HER2, SRC, andAURKA in a DNA sample originating from a human subject in need thereof.In some embodiments, the DNA sample is from a tumor biopsy or a tissuebiopsy suspected of harboring tumor DNA. In some embodiments the DNAsample is from a liquid biological sample isolated from the subject.Exemplary liquid biological samples are described herein. In someembodiments, the DNA sample is partitioned into a plurality of reactionmixtures. The DNA sample may be partitioned such that each reactionmixture comprises 0-2 DNA template molecules. Each reaction mixture cancomprise a primer/probe set for sensitive detection of amplicons asdescribed herein. A primer/probe set can be designed to amplify a regionof interest within a gene suspected of having copy number variation(e.g., a gene amplification). Each primer/probe set can comprise aforward primer, reverse primer, and probe. In particular embodiments,each primer/probe set comprises a primer in excess amounts (e.g., excessprimer) compared to a reverse primer in limiting amounts (e.g., limitingprimer). In some embodiments, each primer/probe set can comprise a lowTm probe that is designed to selectively hybridize to a region that islocated between the excess and limiting primer. In some embodiments, thelow Tm probe is designed to hybridize to the 5′ region of the forwardprimer. In some embodiments, the low Tm probe is designed to hybridizeto the 5′ region of the reverse primer. In some embodiments, a regionsuspected of having copy number variation also harbors a site of a knownmutation. In some embodiments, the low Tm probe is designed to overlaythe mutation site. In some embodiments, the low Tm probe is designed tocorrespond to the wild-type allele. 
In some cases the low Tm probe is designed to have a greater number of mismatches to the mutant allele than to the wild-type allele. In some embodiments, each reaction mixture also comprises a primer/probe set for a reference gene. The reference gene can be, e.g., RNaseP30, HADH, or ZFP3. In some embodiments, the reference primer/probe set comprises a forward primer, reverse primer, and probe. In particular embodiments, the reference primer/probe set comprises an excess primer and a limiting primer which are designed to amplify a region of the reference gene. In some embodiments, the reference primer/probe set further comprises a low Tm probe which is designed to hybridize to a region of the reference gene that is located between the excess and limiting primer. In some embodiments, the partitioned reaction mixtures are subjected to an amplification reaction. In some embodiments, the amplification reaction comprises PCR cycles, wherein the PCR cycles do not comprise a temperature step that results in substantial annealing of the low Tm probe. In some embodiments, a sufficient number of PCR cycles are performed to exhaust the limiting primer, thus resulting in linear amplification utilizing the excess primer. In some embodiments, following the PCR cycles, the reaction mixtures are cooled to a temperature which allows for annealing of the low Tm probes to the amplification products. In some embodiments, following annealing of the low Tm probes, the reaction mixtures are assessed and enumerated for fluorescent detection of the annealed low Tm probes. In some embodiments, a CNV call is generated based on the assessment and enumeration.
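The partitioning described in this embodiment (each reaction mixture comprising 0-2 DNA template molecules) can be explored with a short sketch. It assumes that template molecules distribute randomly among partitions, so that occupancy follows a Poisson distribution; that is a common working model for digital PCR but is not stated in this passage. The function names are hypothetical.

```python
import math


def poisson_pmf(k: int, mean: float) -> float:
    """Probability that a partition holds exactly k template molecules."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def fraction_with_at_most(max_templates: int, templates: int,
                          partitions: int) -> float:
    """Expected fraction of partitions holding 0..max_templates molecules."""
    if partitions <= 0 or templates < 0:
        raise ValueError("need partitions > 0 and templates >= 0")
    mean = templates / partitions  # average molecules per partition
    return sum(poisson_pmf(k, mean) for k in range(max_templates + 1))
```

Calling `fraction_with_at_most(2, templates, partitions)` for a planned input amount and partition count gives a rough indication of how closely a loading would approach the 0-2 molecule condition under this model.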

In some embodiments, one or more methods for sensitive detection ofamplicons comprise partitioning the reaction mixture and nucleic acidsample into discrete volumes prior to amplification. For example, theone or more methods can comprise digital PCR. Methods, kits, and systemsfor partitioning/digital PCR are described herein.

Kits for Sensitive Detection of Amplicons

Also provided in the disclosure are kits for the sensitive detection ofamplicons. Kits may include one or more oligonucleotide primers andprobes as described herein. In some embodiments, the primers and/orprobes are capable of selectively detecting an individual allele of alocus. Kits can include, for example, one or more primer/probe sets.Exemplary primer/probe sets are described herein. For example, kits caninclude primer/probe sets for MET, FGFR1, FGFR2, FLT3, HER3, EGFR, mTOR,CDK4, HER2, RET, HADH, ZFP3, DDR2, AURKA, VEGFA, CDK6, JAK2, BRAF, SRCand RPP30. Kits may further comprise instructions for use of the one ormore primer/probe sets, e.g., instructions for practicing a method ofthe disclosure. In some embodiments, the kit includes a packagingmaterial. As used herein, the term “packaging material” refers to aphysical structure housing the components of the kit. The packagingmaterial can maintain sterility of the kit components, and can be madeof material commonly used for such purposes (e.g., paper, corrugatedfiber, glass, plastic, foil, ampules, etc.). Kits can also include abuffering agent, a preservative, or a protein/nucleic acid stabilizingagent. Kits can also include other components of a reaction mixture asdescribed herein. For example, kits may include one or more aliquots ofthermostable DNA polymerase as described herein, and/or one or morealiquots of dNTPs. Kits can also include control samples of knownamounts of template DNA molecules harboring the individual alleles of alocus. In some embodiments, the kit includes a negative control sample,e.g., a sample that does not contain DNA molecules harboring theindividual alleles of a locus. In some embodiments, the kit includes apositive control sample, e.g., a sample containing known amounts of oneor more of the individual alleles of a locus.

Systems for Sensitive Detection of Amplicons

Also provided in the disclosure are systems for the sensitive detection of amplicons. In some embodiments, the system provides a reaction mixture for sensitive detection of amplicons as described herein. In some embodiments, the reaction mixture is admixed with a DNA sample comprising template DNA. In some embodiments, the system further provides a droplet generator, which partitions the template DNA molecules, probes, primers, and other reaction mixture components into multiple droplets within a water-in-oil emulsion. Exemplary droplet generators are described herein.

Reference Materials for Circulating Nucleic Acid, e.g., DNA

Provided herein are methods and compositions relating to referencematerial for circulating or cell-free nucleic acid, e.g., cfDNA. Nucleicacid used as reference material for cell-free or circulating nucleicacid, e.g., DNA, can be extracted from a known source, e.g., a tissue, acell, or their progeny, of a biological entity. The tissue or cell canbe obtained from a living subject or can be cultured in vitro. Thesubject can be a mammal. The mammal can be a human. The subject can be alaboratory model, such as a mouse, Drosophila, or a rat. The subject canbe a eukaryote. The subject can possess a known germline sequence orgenome sequence.

Non-nucleic acid materials can be removed from a sample, e.g., using anenzymatic treatment (e.g., a protease). The sample can be subjected to atreatment, e.g., homogenization, sonication, French press, dounce, orfreeze/thaw. Following the treatment, the sample can be subjected tocentrifugation. The centrifugation can separate a nucleicacid-containing fraction from a non-nucleic acid-containing fraction.

Nucleic acid can be isolated from the sample using any means known inthe art. For example, nucleic acid can be extracted from the biologicalsample using a liquid extraction (e.g., Trizol, DNAzol) technique.Nucleic acid can also be extracted using a commercially available kit(e.g., Qiagen DNeasy kit, QIAamp kit, Qiagen Midi kit, QIAprep spinkit).

Nucleic acid can be fragmented in situ or de novo through a physical, chemical, or enzymatic means, e.g., to form a uniform distribution.

Nucleic acid can be concentrated by, e.g., centrifugation. Nucleic acidcan be bound to a selective membrane (e.g., silica), e.g., for thepurpose of purification. Nucleic acid can also be enriched for fragmentsof a desired length, e.g., fragments which have a length, or averagelength, less than 1000, 500, 400, 300, 200 or 100 base pairs or bases.Such an enrichment based on size can be performed using, e.g., PEGprecipitation, an electrophoretic gel or chromatography material (Huberet al. (1993) Nucleic Acids Res. 21:1061-6), gel filtrationchromatography, or TSK gel (Kato et al. (1984) J. Biochem, 95:83-86),which publications are hereby incorporated by reference.

Nucleic acids extracted from a sample can be selectively precipitated orconcentrated, e.g., using any methods known in the art.

Reference Material from Reconstituted Chromatin

Nucleic acid from a first source, e.g., a first biological source, that has been isolated can be reconstructed as chromatin. Methods to reconstruct chromatin can include incubation of isolated nucleic acid, e.g., DNA, with purified histones and/or chromatin assembly factors. For example, the ACTIVE MOTIF® Chromatin Assembly kit can be used to reconstruct chromatin. The reassembled chromatin can then be treated with an enzyme, e.g., DNase, e.g., DNase I, DNase II, or micrococcal nuclease, e.g., in a protocol similar to a protocol used for hypersensitivity footprinting, or with a nebulizer. Treatment of the reassembled chromatin with the enzyme can create nucleic acid fragments similar in size to circulating or cell-free nucleic acid, e.g., circulating or cell-free DNA. The nucleic acid fragments can have a mean length of about 140 to about 180 bp or about 150 to about 170 bp.

The nucleic acid fragments can be mixed with nucleic acid, e.g., nucleicacid fragments, from a second source, e.g., a second biological source.The nucleic acid from the second source can be treated in a similarmanner as nucleic acid from the first source to produce nucleic acidfragments with a similar size as nucleic acid fragments from the firstsource. The nucleic acid fragments from the first source can be presentin the mixture at a percentage of the total nucleic acid of about 0.5%to about 50%, about 0.5% to about 25%, about 0.5% to about 10%, or about0.5% to about 5%. The nucleic acid fragments from the first source canbe present in the mixture at about 1%, 2.5%, 5%, 7.5%, 10%, 15%, 20%, or25%. The mixture can be used as a reference for a circulating orcell-free sample, e.g., a cell-free sample comprising cell-free nucleicacid, e.g., DNA, from at least two sources, e.g., a cancerous andnon-cancerous cell. For example, the mixture can be compared to acirculating or cell-free sample.
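As a worked illustration of preparing such a mixture, the following sketch converts a desired first-source percentage into pipetting volumes. It assumes both fragment preparations are available as stocks of known mass concentration and that the percentage is expressed by mass of total nucleic acid; the function name, parameters, and these assumptions are illustrative only and do not come from the disclosure.

```python
def spike_in_volumes(total_ng: float, first_fraction: float,
                     first_stock_ng_per_ul: float,
                     second_stock_ng_per_ul: float) -> tuple[float, float]:
    """Volumes (in ul) of first- and second-source stock for a target mixture.

    first_fraction is the first-source share of total nucleic acid mass,
    e.g. 0.05 for 5%.
    """
    if not 0 < first_fraction < 1:
        raise ValueError("first_fraction must be between 0 and 1")
    if first_stock_ng_per_ul <= 0 or second_stock_ng_per_ul <= 0:
        raise ValueError("stock concentrations must be positive")
    first_ng = total_ng * first_fraction
    second_ng = total_ng - first_ng
    return (first_ng / first_stock_ng_per_ul,
            second_ng / second_stock_ng_per_ul)
```

For example, `spike_in_volumes(100, 0.05, 10, 10)` describes a 100 ng mixture at 5% first-source DNA drawn from two 10 ng/ul stocks.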

Reference Material from Nuclei

Nuclei can be extracted from a sample, e.g., a first biological sample. Nuclei can be purified following treatment of cells with mild detergent. The detergent can be a non-ionic detergent or an ionic detergent. Nuclei or other cell components can be purified following osmotic shock, such as treatment with a hypotonic solution. The nuclei can be harvested or purified by differential centrifugation, e.g., of lysed cells. For example, cells can be lysed, e.g., by treatment with mild detergent or homogenization. The lysate containing intact nuclei can be placed on a density-gradient column, containing, e.g., Percoll, sucrose, or cesium chloride. The density gradient can be continuous. The density gradient can be a step-wise gradient.

The nuclei, or chromatin extracted from the nuclei, can be treated withan enzyme, e.g., DNase, e.g., DNase I, DNase II, or micrococcalnuclease, or with a nebulizer. The resulting nucleic acid fragments canhave a size similar to the size of circulating or cell-free nucleicacid, e.g., circulating or cell-free DNA. The nucleic acid fragments canhave a mean length of about 140 bp to about 180 bp or about 150 bp toabout 170 bp.

The nucleic acid fragments can be mixed with nucleic acid, e.g., nucleicacid fragments, from a second source, e.g., a second biological source.The nucleic acid from the second source can be treated in a similarmanner as nucleic acid from the first source to produce nucleic acidfragments with a similar size as nucleic acid fragments from the firstsource. The nucleic acid fragments from the first source can be presentin the mixture at a percentage of the total nucleic acid of about 0.5%to about 50%, about 0.5% to about 25%, about 0.5% to about 10%, or about0.5% to about 5%. The nucleic acid fragments from the first source canbe present in the mixture at about 1%, 2.5%, 5%, 7.5%, 10%, 15%, 20%, or25% of the total nucleic acid in the mixture.

The mixture can be used as a reference for a circulating or cell-freesample, e.g., a circulating or cell-free sample comprising cell-freenucleic acid, e.g., DNA, from at least two sources, e.g., a cancerousand non-cancerous cell. For example, the mixture can be compared to acirculating or cell-free sample.

Reference Material Following Induction of Apoptosis or Necrosis

Nuclei or other cell components can be extracted from a sample, e.g., a first biological sample. Nuclei can be purified following treatment of cells with mild detergent. The detergent can be a non-ionic detergent or an ionic detergent. Nuclei or other cell components can be purified following osmotic shock, such as treatment with a hypotonic solution. The nuclei can be harvested or purified by differential centrifugation, e.g., of lysed cells. For example, cells can be lysed, e.g., by treatment with mild detergent or homogenization. The lysate containing intact nuclei can be placed on a density-gradient column, containing, e.g., Percoll, sucrose, or cesium chloride. The density gradient can be continuous. The density gradient can be a step-wise gradient.

The intact nuclei or cell components can be extracted following treatment of the sample containing the nuclei to induce necrosis or apoptosis. Apoptosis can be induced, e.g., by an anti-Fas receptor monoclonal antibody. Apoptosis can be induced, e.g., by chemical treatment, such as by addition of doxorubicin, staurosporine, etoposide, camptothecin, paclitaxel, or vinblastine, or a combination of any of these chemicals. Apoptosis can be induced by, e.g., binding of nuclear receptors by glucocorticoid, heat, radiation, nutrient deprivation, viral infection, hypoxia, or increased intracellular calcium concentration. Necrosis can be induced by, e.g., hypoxia, ischemia, an infection, a toxin (e.g., a bacterial toxin or snake venom), frostbite, the complement system, an activated natural killer cell, a peritoneal macrophage, or trauma.

Nucleic acid fragments obtained from a sample following treatment of thesample containing the nuclei to induce necrosis or apoptosis can have asize similar to the size of circulating nucleic acid or cell-freenucleic acid, e.g., circulating or cell-free DNA. The nucleic acidfragments can have a mean length of about 140 to about 180 bp or about150 to about 170 bp.

The nucleic acid fragments can be mixed with nucleic acid, e.g., nucleicacid fragments, from a second source, e.g., a second biological source.The nucleic acid from the second source can be treated in a similarmanner as nucleic acid from the first source to produce nucleic acidfragments with a similar size as nucleic acid fragments from the firstsource. The nucleic acid fragments from the first source can be presentin the mixture at a percentage of the total nucleic acid of about 0.5%to about 50%, about 0.5% to about 25%, about 0.5% to about 10%, or about0.5% to about 5%. The nucleic acid fragments from the first source canbe present in the mixture at about 1%, 2.5%, 5%, 7.5%, 10%, 15%, 20%, or25% of the total nucleic acid in the mixture.

The mixture can be used as a reference for a circulating or cell-freesample, e.g., a circulating or cell-free sample comprising cell-freenucleic acid, e.g., DNA, from at least two sources, e.g., a cancerousand non-cancerous cell. For example, the mixture can be compared to acirculating or cell-free sample.

Reference Material from Culture Media

Reference material for circulating or cell-free nucleic acid, e.g., DNA or RNA, can be obtained from media, e.g., culture media used to culture cells, e.g., human cells, e.g., human cell lines, e.g., human cell lines derived from tumor tissue, e.g., from a specific subject. The media can comprise nucleic acid, e.g., DNA or RNA, e.g., tumor DNA or tumor RNA, e.g., circulating tumor DNA or circulating tumor RNA, or cell-free nucleic acid, e.g., cell-free DNA or cell-free RNA. The nucleic acid from the media can be used as a reference for circulating tumor nucleic acid (e.g., DNA or RNA) or cell-free tumor nucleic acid (e.g., DNA or RNA). In some embodiments, the media comprise a cell-free tumor DNA sample. The volume of media from which nucleic acids can be extracted can be at least, or about, 1 mL, 10 mL, 100 mL, 1 L, 10 L, 100 L, 1000 L, 10,000 L, 100,000 L, or 1,000,000 L. The volume of media from which nucleic acids can be extracted can be about 1 mL to about 1,000,000 L, about 10 mL to about 100,000 L, about 100 mL to about 10 L, or about 1 L to about 10 L. The nucleic acid fragments obtained from the culture media can be similar in size to circulating nucleic acid fragments or cell-free nucleic acid fragments. The nucleic acid fragments can have a mean length of about 140 to about 180 bp or about 150 to about 170 bp.

The nucleic acid fragments can be mixed with nucleic acid, e.g., nucleicacid fragments, from a second source, e.g., a second biological source.The nucleic acid fragments from the first source can be present in themixture at a percentage of the total nucleic acid of about 0.5% to about50%, about 0.5% to about 25%, about 0.5% to about 10%, or about 0.5% toabout 5%. The nucleic acid fragments from the first source can bepresent in the mixture at about 1%, 2.5%, 5%, 7.5%, 10%, 15%, 20%, or25% of the total nucleic acid in the mixture.

The mixture can be used as a reference for a circulating or cell-freesample, e.g., a circulating or cell-free sample comprising cell-freenucleic acid, e.g., DNA, from at least two sources, e.g., a cancerousand non-cancerous cell. For example, the mixture can be compared to acirculating or cell-free sample.

Mixtures of Reference Materials

In some embodiments, the method further comprises producing a reference sample by combining nucleic acids from two or more distinct biological samples after treatment using any of the above methods. The method can further comprise aliquoting and freezing the reference sample. In some embodiments, the two or more biological samples are cell lines from reference germline genomes. The DNA can be mixed such that DNA from each of the two or more biological samples is present in a known ratio. The DNA from one of the two or more biological samples can be present in the DNA mixture at about 0.01% to about 0.5%, about 0.1% to about 0.5%, or about 0.5% to about 1%.

The nucleic acids from the two samples can be mixed in several dilutions that approximate the mixtures of tumor DNA in the background of germline ‘normal’ DNA in a cancer patient, or mother/fetus DNA mixtures present at different times in pregnancy. The better known sample, such as the sample where the genome variants are known with a higher level of certainty, can be diluted down to a proportion of about 0.5% to about 1%. Such proportions can also be about 0.01% to about 0.1%, about 0.01% to about 0.2%, about 0.01% to about 0.3%, about 0.01% to about 0.4%, about 0.01% to about 0.5%, about 0.5% to about 1%, about 1% to about 1.5%, about 1.5% to about 2%, about 2% to about 3%, about 3% to about 4%, or about 4% to about 5%. About 1-5 haploid copies, 1-10 haploid copies, 5-10 haploid copies, 10-20 haploid copies, 10-50 haploid copies, 20-50 haploid copies, 30-50 haploid copies, 40-50 haploid copies, 50-75 haploid copies, or 75-100 haploid copies of the rarer genome can be in the mixture.
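The relationship between the dilution proportion and the number of haploid copies of the rarer genome can be sketched as follows. The conversion factor is taken from the figure of roughly 33,000 genome equivalents per 100 ng of DNA given in Example 3; treating those genome equivalents as haploid human genomes is an assumption made here for illustration, and the function names are hypothetical.

```python
# Roughly 33,000 genome equivalents per 100 ng (figure from Example 3);
# assumed here to mean haploid human genome copies.
GENOME_EQUIVALENTS_PER_NG = 33000 / 100


def rare_genome_copies(total_ng: float, rare_fraction: float) -> float:
    """Expected haploid copies of the rarer genome in total_ng of mixture."""
    if not 0 <= rare_fraction <= 1:
        raise ValueError("rare_fraction must be between 0 and 1")
    return total_ng * GENOME_EQUIVALENTS_PER_NG * rare_fraction


def input_ng_for_copies(target_copies: float, rare_fraction: float) -> float:
    """Total mixture mass (ng) needed to supply target_copies of the rarer genome."""
    if not 0 < rare_fraction <= 1:
        raise ValueError("rare_fraction must be greater than 0 and at most 1")
    return target_copies / (GENOME_EQUIVALENTS_PER_NG * rare_fraction)
```

Under these assumptions, 10 ng of a mixture in which the rarer genome is present at 0.5% would carry about 16.5 haploid copies of that genome, which falls within the 10-20 copy range listed above.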

A reference material can be generated using an FFPE reference material and combined in different proportions. Such proportions can be about 0.01% to about 0.1%, about 0.01% to about 0.2%, about 0.01% to about 0.3%, about 0.01% to about 0.4%, about 0.01% to about 0.5%, about 0.5% to about 1%, about 1% to about 1.5%, about 1.5% to about 2%, about 2% to about 3%, about 3% to about 4%, or about 4% to about 5%. About 1-5 haploid copies, 1-10 haploid copies, 5-10 haploid copies, 10-20 haploid copies, 10-50 haploid copies, 20-50 haploid copies, 30-50 haploid copies, 40-50 haploid copies, 50-75 haploid copies, or 75-100 haploid copies of the rarer genome can be in the mixture.

Reference material can be from urine, blood, semen, saliva, mucosal secretion, cerebrospinal fluid, amniotic fluid, or plasma from a volunteer. Cell-free DNA can be extracted from the volunteer and combined in different proportions. One sample can be present at about 0.01% to about 0.1%, about 0.01% to about 0.2%, about 0.01% to about 0.3%, about 0.01% to about 0.4%, about 0.01% to about 0.5%, about 0.5% to about 1%, about 1% to about 1.5%, about 1.5% to about 2%, about 2% to about 3%, about 3% to about 4%, or about 4% to about 5% of the total sample. About 1-5 haploid copies, 1-10 haploid copies, 5-10 haploid copies, 10-20 haploid copies, 10-50 haploid copies, 20-50 haploid copies, 30-50 haploid copies, 40-50 haploid copies, 50-75 haploid copies, or 75-100 haploid copies of the rarer genome can be in the mixture.

As used herein, the term “about” can mean +/−10%.

EXAMPLES

Example 1

FIG. 24 depicts a method used to assess a cancer in a subject. A subject had a colonoscopy and was discovered to harbor a colon tumor. A tumor biopsy and blood draw were collected from the subject at time point 0, and were used to aid in the diagnosis of colon cancer in the subject. The tumor and normal cells from the first blood draw were sequenced. Sequencing revealed the presence of three mutations in the subject's tumor. The mutations were point mutations in the APC, KRAS, and TP53 genes. The stage of the subject's cancer was determined. The subject underwent a first treatment (surgery) to remove the tumor. Following the first treatment, a second blood draw was performed. It was determined that the subject's tumor had metastasized. The subject was administered a second therapy (chemotherapy) to manage the cancer. Subsequent blood draws were performed to assay the mutational status of the three genes in cell-free DNA from the blood.

Example 2: Validation Assay for a Tumor-Specific Mutation in the Subject with Colon Cancer

NCI-H1573 (CRL-5877) cell lines harboring the KRAS G12A mutation (mu) were obtained as frozen stocks from the American Type Culture Collection (ATCC). Genomic DNA (gDNA) was prepared from cell line material using a commercially available kit (DNeasy Blood & Tissue kit, QIAGEN), according to the manufacturer's suggested protocol. Estimates of DNA concentration were obtained spectrophotometrically by measuring the OD260 (NanoDrop 1000, Thermo Fisher Scientific Inc.).

Genomic DNA from NA18507 cell lines was used as a surrogate for wild-type DNA (wt) and obtained as purified stocks (Coriell). Two microliters of a mixture containing wt (30 ng) and mu (6 ng) DNA was assembled into a 20 μl ddPCR reaction mix from 2× ddPCR supermix for probes, 0.2 uM final of each forward primer (wt: 5′-AGATTACGCGGCAATAAGGCTCGGTTGGCATTGGATACTACTTGCCTACGCCACC-3′ (SEQ ID NO: 1); mu: 5′-AATAGCTGCCTACATTGGGTTCGGTCGTAACTTAGGAACTCTTGCCTACGCCAGC-3′ (SEQ ID NO: 2)), 0.4 uM of reverse primer (5′-CCTGCTGAAaAATGACTGAAT-3′ (SEQ ID NO: 3)), and 1 uM each of reporter probes (wt: 5′-HEX-CCAACCGAG/ZEN/CCTTATTGCCG-IABkFQ-3′ (SEQ ID NO: 4); mu: 5′-FAM-AGTTACGAC/ZEN/CGAACCCAATGTAGG-IABkFQ-3′ (SEQ ID NO: 5)). Each PCR mixture was then converted into droplets for analysis via the QX100 ddPCR system according to the manufacturer's suggestions. Annealing temperature was varied to determine the optimal conditions for segregating and quantifying the wt (HEX) and mu (FAM) droplet signals (FIG. 25). Resulting clusters were deconvoluted (FIGS. 26A-26D) by using ddPCR mixtures containing only the mu (FIG. 26A), only the wt (FIG. 26B), or both probes (FIG. 26C) to assign membership of each cluster as mu or wt.
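The relationship between the forward primers and reporter probes listed above can be checked with a short Python sketch. It looks for the stretch of each forward primer that is the reverse complement of the corresponding probe. The sequences are copied from the primer and probe listing in this example, with the dye, internal ZEN quencher, and terminal quencher labels omitted; the helper names are hypothetical.

```python
# Map each base to its complement (upper and lower case).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def probe_site(primer: str, probe: str) -> int:
    """0-based index in primer of the stretch complementary to probe, or -1."""
    return primer.find(reverse_complement(probe))


# Forward primers (SEQ ID NO: 1 and 2) as listed in Example 2.
WT_PRIMER = "AGATTACGCGGCAATAAGGCTCGGTTGGCATTGGATACTACTTGCCTACGCCACC"
MU_PRIMER = "AATAGCTGCCTACATTGGGTTCGGTCGTAACTTAGGAACTCTTGCCTACGCCAGC"

# Reporter probes (SEQ ID NO: 4 and 5) with labels and ZEN quencher removed.
WT_PROBE = "CCAACCGAG" + "CCTTATTGCCG"
MU_PROBE = "AGTTACGAC" + "CGAACCCAATGTAGG"

if __name__ == "__main__":
    print("wt probe site:", probe_site(WT_PRIMER, WT_PROBE))
    print("mu probe site:", probe_site(MU_PRIMER, MU_PROBE))
```

A result other than -1 indicates that the probe is complementary to a segment within the 5′ portion of its forward primer, which is the arrangement expected for a probe directed at a primer-encoded barcode region.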

Example 3: DNA Sample Processing for Target-Enriched Library Preparation

100 ng (~33,000 genome equivalents) of fragmented and/or damaged DNA (e.g., from FFPE samples) was first repaired by excising oxidized and abasic sites through the use of a cocktail of repair enzymes (Endo VIII, Fpg, and UDG) in the presence of T4 polynucleotide kinase, 1 mM ATP, and 15% PEG-8000 in 1× ligase reaction buffer at a final reaction volume of 100 ul to generate DNA fragments that are terminated by a 5′-phosphate and a 3′-OH.

Repaired DNA was purified using a commercially available kit (GeneJet; Thermo Scientific). Eluted DNA (50 ul) was then concentrated via sedimentation with PEG-8000 (20% final) in the presence of LPA and Tris buffer containing 10 mM Mg²⁺. The resulting pellet was rinsed once with 0.5 ml of 70% ethanol and air-dried for 5 minutes.

Example 4: 5′ Adaptor Ligation of DNA Fragments

Repaired DNA prepared as above was resuspended in 2 ul of nuclease-free water. Repaired DNA can then be fully or partially denatured either chemically, through brief treatment with alkali (NaOH or KOH) followed by neutralization with sodium acetate; or, preferably, heat denatured with rapid cooling on ice (3 min at 95° C.).

Repaired DNA was pre-adenylated by combining the following components in an adenylation reaction mixture as shown in Table 2:

TABLE 2 Adenylation reaction mixture (DNA sample).
10x NEB4 buffer             0.5 μl
1 mM ATP                    0.5 μl
Thermophilic RNA ligase     0.5 μl
50% PEG-8000                1.5 μl
DNA sample + water          2.0 μl

Following incubation for 1 hour at 65° C., the following components (Table 3) were added to the adenylation reaction mixture. To test the effect of additional ligase, 2 μl of ligase or no additional ligase was added to the ligation mix. Optionally, the adenylated product is purified via sedimentation.

TABLE 3 Ligation Mix
10x NEB4 buffer             4.5 μl
100 uM adaptor              1 μl
25 mM Manganese acetate     5.0 μl
50% PEG-8000                13.5 μl
Thermophilic RNA ligase     0 or 2 μl
water                       (up to final volume of 50 μl)

The reaction was incubated for 1 hr @ 65° C., followed by heat inactivation for 10 min @ 80° C., then by 3 min @ 95° C. 1 μl of protease was then added and the reaction incubated for 30 min @ 37° C. followed by heat inactivation for 15 min @ 75° C. The resulting ligation products were sedimented to remove unreacted adaptors and washed as described above.

The reaction mixture in which ligation occurs can comprise a pH in a range of about pH 1 to pH 14. In some embodiments, the reaction mixture in which ligation occurs comprises a pH of at least pH 7.1, pH 7.2, pH 7.3, pH 7.4, pH 7.5, pH 7.6, pH 7.7, pH 7.8, pH 7.9, pH 8, pH 8.1, pH 8.2, pH 8.3, pH 8.4, pH 8.5, pH 8.6, pH 8.7, pH 8.8, pH 8.9, pH 9, pH 9.5, pH 10, pH 10.5, pH 11, pH 11.5, pH 12, pH 12.5, pH 13, or greater. In some embodiments, the reaction mixture in which ligation occurs comprises a pH of about neutral. In some embodiments, the reaction mixture in which ligation occurs comprises a pH of about pH 7.1 to about pH 9, about pH 7.5 to about pH 9, about pH 8 to about pH 10, or about pH 7 to about pH 8. The pH of a reaction mixture in which ligation occurs can be less than pH 14, 13, 12, 11, 10, 9, 8, 7, 6.5, 6, 5.5, 5, 4.5, 4, 3.5, 3, 2.5, 2, 1.5, or 1. The pH of a reaction mixture in which ligation occurs can be about pH 5 to about pH 6, about pH 4 to about pH 5, about pH 3 to about pH 4, about pH 2 to about pH 3, or about pH 1 to about pH 2.

Example 5: (3′-End Adaptor Ligation)

Repaired DNA as prepared in Example 3 or 5′-adapted DNA libraries as prepared in Example 4 were resuspended in 2 μl of nuclease-free water. The DNA can then be fully or partially denatured either chemically, through brief treatment with alkali (NaOH or KOH) followed by neutralization with sodium acetate; or, preferably, heat denatured with rapid cooling on ice (3 min @ 95° C.).

The 3′ adaptor DNA was pre-adenylated by combining the following components in an adenylation reaction mixture as shown in Table 4:

TABLE 4 Adenylation reaction mixture (3′ adaptor)
10x NEB4 buffer             0.5 μl
50% PEG-8000                1.5 μl
1 mM ATP                    0.5 μl
100 uM adaptor              2.0 μl
Thermophilic RNA ligase     0.5 μl

Following incubation for 1 hour at 65° C., the following components (Table 5) were added to the adenylation reaction mixture. Denatured DNA refers to either repaired DNA as prepared in Example 3 or 5′-adapted DNA as prepared in Example 4. To test the effect of additional ligase, 2 μl of ligase or no additional ligase was added to the ligation mix. Optionally, the adenylated product is purified via sedimentation.

TABLE 5 Ligation Mix
Adenylation reaction mixture (3′ adaptor)   5.0 μl
10x NEB4 buffer                             4.5 μl
Denatured DNA                               2 μl
25 mM Manganese acetate                     5.0 μl
50% PEG-8000                                13.5 μl
Thermophilic RNA ligase                     0 or 2 μl
water                                       up to final volume of 50 μl

The reaction was incubated for 1 hr @ 65° C., followed by heat inactivation for 10 min @ 80° C., then by 3 min @ 95° C.

1 μl of protease is added and the reaction incubated for 30 min @ 37° C.followed by heat inactivation for 15 min @ 75° C.

The resulting ligation products were sedimented and washed as above to remove unreacted adaptors and resuspended in 10 μl of 1×NEB4 with 0.1% BSA.

The reaction mixture in which ligation occurs can comprise a pH in a range of about pH 1 to pH 14. In some embodiments, the reaction mixture in which ligation occurs comprises a pH of at least pH 7.1, pH 7.2, pH 7.3, pH 7.4, pH 7.5, pH 7.6, pH 7.7, pH 7.8, pH 7.9, pH 8, pH 8.1, pH 8.2, pH 8.3, pH 8.4, pH 8.5, pH 8.6, pH 8.7, pH 8.8, pH 8.9, pH 9, pH 9.5, pH 10, pH 10.5, pH 11, pH 11.5, pH 12, pH 12.5, pH 13, or greater. In some embodiments, the reaction mixture in which ligation occurs comprises a pH of about neutral. In some embodiments, the reaction mixture in which ligation occurs comprises a pH of about pH 7.1 to about pH 9, about pH 7.5 to about pH 9, about pH 8 to about pH 10, or about pH 7 to about pH 8. The pH of a reaction mixture in which ligation occurs can be less than pH 14, 13, 12, 11, 10, 9, 8, 7, 6.5, 6, 5.5, 5, 4.5, 4, 3.5, 3, 2.5, 2, 1.5, or 1. The pH of a reaction mixture in which ligation occurs can be about pH 5 to about pH 6, about pH 4 to about pH 5, about pH 3 to about pH 4, about pH 2 to about pH 3, or about pH 1 to about pH 2.

Example 6: Quantitation of Ligation Efficiency Via ddPCR

FIG. 27 depicts an exemplary embodiment of a method for quantitating efficiency of a ligation method described herein. Ligation of nucleic acid molecules (NA) to a biotinylated oligonucleotide (5′ or 3′ adaptor) was performed as described above. The ligation reaction can result in ligation products (ligated NA) comprising biotinylated oligonucleotides covalently linked to sample nucleic acids, and can possibly also result in unligated sample nucleic acids (unligated NA). Ligation products were sedimented through centrifugation for 20 min @ 22,000 g. Supernatant was removed and the pellet was resuspended in 5 ul of 0.1×TET Buffer (1 mM TrisHCl, 0.1 mM EDTA, 0.05% Tween-20, pH=8). Resuspended pellet was made up to a final volume of 50 μl in 1×NEB4+0.1% BSA and 10 μl of Streptavidin-ferrofluids comprising streptavidin-conjugated magnetic particles (MagCellect, R&D Systems, Minneapolis, Minn.) pre-washed with 1×NEB4. Following incubation for 15 min at room temperature, the mixture was magnetized for 5 minutes. The supernatant containing free and therefore un-ligated sample nucleic acids was removed. The remaining bound material comprising ligation products was resuspended in 50 μl of 1×NEB4+0.1% BSA. Five microliters of the bound and unbound fractions were interrogated via ddPCR with Taqman assays designed to the RNaseP gene locus. Ligation efficiency was calculated as [bound signal]/([bound signal]+[unbound signal]).
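The efficiency formula above can be sketched in a few lines of Python. This is only an illustration: the function name `ligation_efficiency` and the input values are assumptions chosen for the example, not part of the described method, and the inputs are taken to be ddPCR concentrations (copies/μl) of the bound and unbound fractions measured in equal volumes.

```python
def ligation_efficiency(bound, unbound):
    """Return bound / (bound + unbound), the fraction of signal in the bound fraction.

    `bound` and `unbound` are ddPCR concentrations (e.g., copies/ul) measured
    from equal volumes of the bound and unbound fractions.
    """
    total = bound + unbound
    if total <= 0:
        raise ValueError("bound + unbound must be positive")
    return bound / total


# Illustrative (hypothetical) concentrations, not measured data.
efficiency = ligation_efficiency(bound=90.0, unbound=10.0)
print(f"ligation efficiency: {efficiency:.1%}")
```

Because both fractions are resuspended to the same volume before ddPCR, the ratio of concentrations can be used directly without a volume correction.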

The ligation efficiencies of the 5′ and 3′ adaptor library preparations (Examples 4 and 5) were quantified as above. FIG. 28 depicts ddPCR results for the 5′ end adaptor ligation and 3′ end adaptor ligation reactions, respectively. Results depicted in FIG. 28, top panel, indicate that 2-step 5′ end adaptor ligation reactions in which the adenylation and ligation steps were performed serially were highly efficient. Without additional ligase, the average concentration of bound signal was 45.35 copies/μl, while the average concentration of unbound signal was 4.505 copies/μl, indicating a ligation efficiency of 90.9%. With additional ligase, the average concentration of bound signal was 36.6 copies/μl, while the average concentration of unbound signal was 4.43 copies/μl, indicating a ligation efficiency of about 89%.

Results depicted in FIG. 28, bottom panel, indicate that two-step 3′ end adaptor ligation reactions in which the adenylation and ligation steps were performed serially were highly efficient. For the traditional 1-step ligation reaction in which adenylation and ligation steps co-occur in one reaction, the average concentration of bound signal was 14.25 copies/μl, while the average concentration of unbound signal was 36.55 copies/μl, indicating a ligation efficiency of 28%. By contrast, for the two-step 3′ end adaptor ligations performed without further addition of ligase, the average concentration of bound signal was 73.75 copies/μl, while the average concentration of unbound signal was 1.49 copies/μl, indicating a ligation efficiency of 98%. For the two-step 3′ end adaptor ligations performed with further addition of ligase after adenylation, the average concentration of bound signal was 71.7 copies/μl, while the average concentration of unbound signal was 2.38 copies/μl, indicating a ligation efficiency of 96.8%. From these results, the possibility of serially performing adenylation and ligation reactions in a single reaction mixture was demonstrated. Furthermore, it was determined that the two-step process in which adenylation and ligation are performed separately in a single reaction mixture greatly enhances ligation efficiency.

Another surprising result is that further addition of ligase to the reaction mixture following adenylation does not appear to enhance ligation efficiency, despite the fact that not only the ATP concentration but also the ligase concentration is diluted to the same degree by the further addition of reaction components (e.g., water, buffer, PEG, Mn²⁺) upon commencement of the ligation step. Without wishing to be bound by theory, it is possible that adenylated donor nucleic acid molecules remain complexed to the ligase enzyme. Upon dilution of ATP and addition of acceptor nucleic acid molecules, the complexed ligase enzyme can be released from inhibition and catalyze the ligation of an acceptor nucleic acid molecule to the adenylated nucleic acid molecule.

Example 7: Effect of Adaptor Length and PEG-8000 on Ligation Efficiency

Sample DNA was prepared and adenylated as described in Example 4 in a reaction mixture comprising 15 or 20% PEG-8000. Following adenylation, adaptors of length 19 nt, 41 nt, or 61 nt were ligated to the adenylated DNA as described in Example 4. Either Mth RNA ligase or CircLigase II was used as the ATP-dependent RNA ligase. FIG. 29 depicts ddPCR results for the above ligation reaction conditions. The results indicate that adaptor length may affect ligation efficiency, and that in cases wherein CircLigase II is used as the RNA ligase, 20% PEG-8000 may be used to increase the efficiency of long (e.g., 61 nt) adaptor ligation reactions.

Example 8: Effect of Mn²⁺ Vs. Incubation Temperature in 20% PEG-8000 on Ligation Efficiency

Sample DNA was prepared and adenylated using a two-step adenylation/ligation method as described in Example 4, in a reaction mixture comprising 20% PEG-8000. Either Mth RNA ligase, CircLigase II, or T4 RNA Ligase (representing commercially available ATP-dependent RNA ligases) was used. The adenylation and ligation reactions were incubated at 37, 60, 65, or 70° C. for 1 hour each. The ligation reactions were conducted in the presence of 0, 2.5 mM, 5 mM, or 7.5 mM Mn²⁺. FIG. 30 depicts ddPCR results for the above ligation reaction conditions. The Y axis is shown in logarithmic scale. Accordingly, any difference in bound vs. free signal that is greater than the distance between the Y axis gridlines (e.g., labeled on the Y axis) indicates a ligation efficiency of 90% or greater. These results indicate that reaction conditions can be tailored to produce over 90% ligation efficiency for all commercially available ATP-dependent RNA ligases, and that Mn²⁺ appears to facilitate the ligation step.
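The relationship between the gap on a logarithmic axis and ligation efficiency can be made explicit with a short sketch. It assumes, as the 90% figure above suggests, that adjacent labeled gridlines are one decade (a factor of 10) apart; the function name `efficiency_from_log_gap` is hypothetical and introduced only for illustration.

```python
def efficiency_from_log_gap(decades):
    """Ligation efficiency implied by a bound-vs-free gap of `decades` on a log10 axis.

    A gap of d decades means bound/free = 10**d, so
    efficiency = bound / (bound + free) = 10**d / (1 + 10**d).
    """
    ratio = 10.0 ** decades
    return ratio / (1.0 + ratio)


# One full decade between bound and free signal: 10 / 11 of the signal is bound.
print(f"{efficiency_from_log_gap(1.0):.3f}")
# Two decades: 100 / 101 of the signal is bound.
print(f"{efficiency_from_log_gap(2.0):.3f}")
```

A gap of exactly one decade therefore corresponds to 10/11, or roughly 91%, which is consistent with reading any larger gap as an efficiency of 90% or greater.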

Example 9: (Optional) Linear Expansion of 3′-End Adapted Libraries

A 5 μl aliquot of a resuspended 3′-end library prepared according to Example 5 is assembled into the following mixture (Table 6) for linear expansion:

TABLE 6 Linear Expansion Reaction Mixture
adapted DNA library                                                          10.0 μl
5x Phusion buffer (New England Biolabs)                                      20.0 μl
DMSO                                                                         3.0 μl
10 mM dNTP                                                                   2.0 μl
100 uM expansion primer (at least partially complementary to adaptor)       0.5 μl
water                                                                        67.0 μl
Phusion (2 U/μl) (New England Biolabs)                                       1 μl

The adapted library is expanded according to the following cycling parameters: 3 min at 98° C.; 10 s at 98° C., 10 s at 68° C., 5 min at 72° C., 20 cycles; 5 min at 72° C.; 4° C. hold.
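For planning purposes, the cycling parameters above can be written as a small data structure and summed to estimate the programmed hold time. This is a minimal sketch: the tuple layout and the name `total_hold_seconds` are assumptions made for illustration, and the estimate ignores ramp times between temperatures and the final 4° C. hold.

```python
# Each step: (description, temperature in deg C, hold time in seconds, repeats).
LINEAR_EXPANSION_PROGRAM = [
    ("initial denaturation", 98, 3 * 60, 1),
    ("denaturation", 98, 10, 20),
    ("annealing", 68, 10, 20),
    ("extension", 72, 5 * 60, 20),
    ("final extension", 72, 5 * 60, 1),
]


def total_hold_seconds(program):
    """Sum hold time x repeats over all steps (thermal ramping is not included)."""
    return sum(seconds * repeats for _, _, seconds, repeats in program)


seconds = total_hold_seconds(LINEAR_EXPANSION_PROGRAM)
print(f"programmed hold time: {seconds} s (about {seconds / 60:.0f} min)")
```

The 5 min extension repeated over 20 cycles dominates the run time, so the program takes a little under two hours before instrument ramping is considered.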

Upon completion, the entire reaction is incubated with 10 μl of Streptavidin-ferrofluids comprising streptavidin-conjugated magnetic particles (MagCellect, R&D Systems, Minneapolis, Minn.) prewashed with 1×NEB4, for 30 min at 37° C.

The solution is magnetized for 5 minutes, and the solution phase containing expanded library members is removed.

The solution phase is extracted with phenol:chloroform:isoamyl alcohol (25:24:1) and the aqueous layer precipitated with 1 volume of 5 M NH4 acetate and 1 volume of isopropanol.

After incubation for 20 min at −20° C., the solution is centrifuged for 30 min at 22,000 g at 4° C.

The resulting pellet is washed once with 500 μl of 70% ethanol and air-dried for 5 minutes.

Example 10: Oligo-Selective Finishing (Reverse OS-Seq)

DNA library members comprising a single 5′ adaptor sequence may undergo target-selective addition of a 3′ adaptor sequence. Methods for the addition of a 3′ adaptor sequence to desired target regions are described in, e.g., US Patent Application Publication No. 20120157322, hereby incorporated by reference. 5′-adapted libraries prepared according to Example 4, optionally expanded according to Example 9, are resuspended in 1×NEB4 with 0.1% BSA and added to the following mix (Table 7):

TABLE 7 Annealing mixture
adapted DNA library      10.0 μl
5x Phusion buffer        16.0 μl
4 uM OS-seq probe set    5.0 μl
DMSO                     3.0 μl
water                    46.0 μl

The above reaction mix is denatured and annealed under the following parameters: 2 min @ 95° C.; 10 s @ 95° C., −1° C./cycle, 0.1° C./s, 24 cycles; 30 min @ 72° C.

The annealed mixture is then extended by adding the following polymerase mixture (Table 8):

TABLE 8 polymerase mixture
adapted DNA library      80.0 μl
5x Phusion buffer        4.0 μl
10 mM dNTPs              2.0 μl
water                    13.0 μl
Phusion (2 U/μl)         1.0 μl

After incubation for 10 min @ 72° C., the reaction is brought to 37° C.

Unfinished fragments and unextended oligonucleotides can then be optionally removed by incubation with Exonuclease I or Exo-SAP IT for 30 minutes.

1 μl of protease is added and the reaction incubated for 30 min @ 37° C. followed by heat inactivation for 15 min @ 75° C.

Reactions are then purified via sedimentation with 1 volume of a 2×PEGppt solution (1×NEB4, 10 ug LPA, 30% PEG-8000).

Example 11: Oligo-Selective Finishing with Expansion (Reverse OS-Seq)

5′-adapted libraries prepared according to Example 4, optionally expanded according to Example 9, are annealed as described in Example 10.

Following incubation for 10 min @ 72° C., the products are expanded immediately according to the following cycling parameters: 10 s @ 98° C., 10 s @ 68° C., 2 min @ 72° C., 20 cycles; 5 min @ 72° C.; 4° C. hold.

Extended products are then double-stranded by addition of an extension primer.

Unfinished fragments and unextended oligonucleotides are then removed by incubation with Exonuclease I or Exo-SAP IT for 30 minutes.

1 μl of protease is added and the reaction incubated for 30 min @ 37° C. followed by heat inactivation for 15 min @ 75° C.

Reactions are then purified via sedimentation with 1 volume of a 2×PEGppt solution (1×NEB4, 10 ug LPA, 30% PEG-8000).

Example 12: Library Circularization

DNA library members comprising a single adaptor sequence at a first end may undergo target-selective addition of a second adaptor sequence at a second end using a library circularization method. Exemplary library circularization methods are described in U.S. Patent Application Pub. No. 20120003657, hereby incorporated by reference. 3′-end adapted library fragments are prepared as above using a non-palindromic hexamer (e.g., as described in U.S. Patent Application Pub. No. 20120003657) as the 3′ adaptor.

A circularization adaptor (in U.S. Patent Application Pub. No. 20120003657), possessing a T7 promoter sequence and 3′-overhangs complementary to the 3′-end adaptor, is annealed to the 3′-adapted library fragments at a 10-fold molar excess.

The fragments are then ligated by the addition of T4 DNA ligase, creating target region-bearing circular products. Alternatively, a polymerase can be used to create the target region-bearing circular products.

Linear products are removed through incubation with a cocktail of Exo III and Exo I.

Example 13: Fluorescently Labeled Library

5′-end adapted library fragments are prepared as above in Example 4 using fluorescently labeled (Cy3, Cy5, FAM, HEX, etc.) oligo-dT hexamers as the 5′ adaptor. The resulting ligation products can be hybridized to an array CGH system, bead-array system, etc.

Example 14: Direct Sequencing

A 5′-adenylated oligonucleotide (adenylated chemically or enzymatically) terminated with a 3′-end blocking group “x” (dideoxy-dNTP, biotinylated, etc.) and possessing a primer site as well as a region complementary to the surface bound oligonucleotide (flow-cell or bead) is ligated to the 3′-end of the native DNA, mediated by truncated or mutated RNA ligase 2 from T4 or Mth, as described in Example 5:

5′-P-DNA-OH-3′+5′-Ad-adaptorB-x-3′=>5′-P-DNA-adaptorB-x-3′

This is then followed by ligation of a second ssDNA adaptor using RNA ligase or CircLigase that contains a second primer site as well as the region complementary to the other surface bound oligonucleotide (flow-cell or bead), to create a full length product that can be directly sequenced. The second ligation can be performed as described in Example 4:

5′-HO-adaptorA-OH-3′+5′-P-DNA-adaptorB-x-3′=>5′-HO-adaptorA-DNA-adaptorB-x-3′

Alternatively, fragmented DNA can be dephosphorylated upon repair (as above):

5′-P-DNA-OH-3′=>5′-HO-DNA-OH-3′

Following dephosphorylation and denaturation (alkaline or heat), a phosphorylated adaptor (phosphorylated chemically or enzymatically) can be ligated to the fragmented DNA with CircLigase:

5′-HO-DNA-OH-3′+5′-P-adaptorB-x-3′=>5′-HO-DNA-adaptorB-x-3′

This adaptor-modified library can then be enzymatically phosphorylated with T4 polynucleotide kinase:

5′-HO-DNA-adaptorB-x-3′=>5′-P-DNA-adaptorB-x-3′

A second adaptor can then be introduced by ligation with CircLigase:

5′-P-DNA-adaptorB-x-3′+5′-HO-adaptorA-OH-3′=>5′-HO-adaptorA-DNA-adaptorB-x-3′

The resulting library member can then be sequenced directly as follows using either the Illumina flow-cell system or bead based systems (Ion Torrent/Roche 454). FIG. 31 depicts an exemplary embodiment of sequencing using an Illumina NGS platform.

Example 15: Preparation of Oligonucleotide Primers for Capture and Enrichment of Target Sequences

A series of Python scripts were created to generate a set of oligonucleotide primers for capture and enrichment of target sequences from a nucleic acid sample. Exon locations corresponding to genes listed in Table 9 below were curated from CCDS release 15.

TABLE 9 List of genes for exon capture
ABL1 AKT1 ALK APC ATM AURKA AURKB AXL BCL2 BRAF BRCA1 BRCA2 CCND1 CDH1 CDK2 CDK4 CDK5 CDK6 CDK8 CDK9 CDK12 CDKN2A CEBPA CSF1R CTNNB1 CYP2D6 DDR2 DNMT3A DPYD EGFR EPCAM ERBB2 ERBB3 ERBB4 ERCC1 ERCC2 ERCC3 ERCC5 ERCC6 EZH2 ESR1 FGFR1 FGFR2 FGFR3 FGFR4 FLT3 GNA11 GNAQ GNAS HNF1A HRAS IDH1 IDH2 JAK2 JAK3 KDR KIT KRAS MAP2K1 MAP2K2 MAPK1 MET MLH1 MPL MRE11A MSH2 MTOR MSH6 MYC MUTYH NOTCH1 NPM1 NRAS PARP1 PARP2 PDGFRA PIK3CA PMS2 PTCH1 PTCH2 PTEN PTPN11 RB1 RET RUNX1 SMAD4 SMARCB1 SMO SRC STK11 TET2 TP53 UGT1A1 VEGFA VHL WT1

Entries with overlapping exon locations were merged to create a single entry spanning the overlapping exons. Generated co-ordinates were then used to extract sequences from the corresponding human reference genome build (GRCh37.p13) with a 600 base pad at both the 5′ and 3′ ends. Oligonucleotide sequences for both sense and reverse complement strands flanking the exon were then identified according to the following criteria: (1) between 10 and 36 nucleotides in length; (2) possessing a 70% fractional annealing temperature between 56° C. and 60° C.; (3) possessing a GC content between 30% and 70%; (4) possessing C or G homopolymer stretches of less than 4 contiguous bases; (5) absence of palindromic sequences of 6 bases or greater; (6) less than 50% self-complementarity. Upon identification of exon-flanking oligos, the largest interprobe distance less than 300 bases was calculated such that an even number of (+) and (−) oligonucleotide probes could be created. If the distance between the exon-flanking oligonucleotides is greater than 300, the region between the two flanking oligos is further divided, such that the region has a minimal, even number of probes that divide the region of interest. These positions were used to create search windows to identify oligonucleotide probes according to the criteria outlined above. Capture sequences designed to tile about every 300 nt of the sense and anti-sense strands corresponding to exons of the genes in Table 9 (above) were identified (e.g., SEQ ID NOS 125-1947).
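The interval merging and several of the sequence criteria above lend themselves to a short Python sketch. This is not the scripts referred to in this Example; all function names are hypothetical. It covers only criteria (1), (3), (4), and (5). Criteria (2) and (6) are omitted because the method used to compute the fractional annealing temperature and the self-complementarity is not specified here. A palindrome is assumed to mean a stretch that equals its own reverse complement, and exon intervals that touch are treated as overlapping.

```python
import re

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def merge_intervals(intervals):
    """Merge overlapping (start, end) exon intervals into single spanning entries."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps the previous entry: extend it instead of adding a new one.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def reverse_complement(seq):
    return seq.translate(_COMPLEMENT)[::-1]


def gc_fraction(seq):
    return (seq.count("G") + seq.count("C")) / len(seq)


def has_palindrome(seq, size=6):
    """True if any window of `size` bases equals its own reverse complement."""
    return any(
        seq[i:i + size] == reverse_complement(seq[i:i + size])
        for i in range(len(seq) - size + 1)
    )


def passes_filters(seq):
    """Apply criteria (1), (3), (4), and (5) to a candidate oligonucleotide."""
    seq = seq.upper()
    if not 10 <= len(seq) <= 36:           # (1) length
        return False
    if not 0.30 <= gc_fraction(seq) <= 0.70:  # (3) GC content
        return False
    if re.search(r"C{4,}|G{4,}", seq):     # (4) C or G runs of 4 or more
        return False
    if has_palindrome(seq, size=6):        # (5) palindromes of 6 or greater
        return False
    return True


print(merge_intervals([(100, 200), (150, 300), (400, 500)]))
print(passes_filters("ACCTGAACTGTCAGTCCA"))    # no violations of the four checks
print(passes_filters("ACCTGAATTCTGTCAGTCCA"))  # contains the palindrome GAATTC
```

Checking only 6-base windows is sufficient for criterion (5) under the reverse-complement definition, because any longer palindrome of that kind contains a 6-base palindrome at its center.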

Oligonucleotide capture sequences were appended to the 3′-end of a standard barcoded Illumina P5 adaptor sequence to create the set of target-selective oligonucleotide (TSO) primers. Primers targeting sense and reverse complement strands received unique barcodes. A schematic of an exemplary TSO primer is shown in FIG. 32. Primers were individually synthesized using standard phosphoramidite chemistry, e.g., with 2 phosphorothioate linkages at the 3′-terminal and penultimate bases (Integrated DNA Technologies). TSO primers were pooled by strand. All sense strand TSOs were pooled as TSO Set 1 primers (SEQ ID NOS 1948-3770). All reverse strand TSOs were pooled as TSO Set 2 primers (SEQ ID NOS 3771-5593).

Example 16: Multiplex Targeted Sequencing with Barcoding

The following protocol is designed to process a plurality of purified DNA samples simultaneously. These samples can be derived from formalin-fixed paraffin-embedded tissue (FPET) material, from flash frozen tissue (FFT), or from a liquid sample (e.g., whole blood or a substantially cell-free sample such as plasma or serum, urine, mucus, etc.). DNA in the samples is fragmented by shearing. The average length of fragmented DNA is about 100-500 base pairs (bp).

Stage 1: DNA Repair (Approximate Time 1.5 Hrs)

Fragmented DNA samples are admixed in a reaction mixture comprising the repair enzymes formamidopyrimidine [fapy]-DNA glycosylase (Fpg, New England Biolabs), Uracil-DNA Glycosylase (UDG, New England Biolabs), Endonuclease VIII (EndoVIII, New England Biolabs), and RNase If (New England Biolabs). The samples are then incubated at 37° C. and then heat inactivated at 75° C. according to the manufacturer's instructions. This reaction serves to remove damaged bases and to remove contaminating RNA from the sample. Upon completion of the reaction, samples are then incubated with T4 Polynucleotide kinase (PNK, New England Biolabs) in order to phosphorylate 5′ ends of the DNA fragments. Upon completion of the PNK reaction, samples are then incubated with terminal deoxynucleotidyl transferase (TdT) enzyme (New England Biolabs) to block 3′ hydroxyl groups of the DNA fragments with the addition of dideoxynucleotides.

Upon completion of the TdT reaction, repaired DNA fragments comprising 5′ phosphates and blocked 3′ hydroxyl groups are purified using magnetic beads (SeraMAG, Thermo Fisher), and then quantified using, e.g., the Droplet Digital PCR PrimePCR RPP30 assay (#100-31243) or Qubit ssDNA assay kit in conjunction with a Bioanalyzer/Experion system.

Adaptor Ligation of Sample DNA

The purified and quantitated DNA samples are ligated to adaptor oligonucleotides comprising a sample-specific barcode. Adaptor oligonucleotides generally have sequence structure as shown in FIG. 33. 100-300 ng of repaired and 5′ phosphorylated sample DNA and adaptors are heat-denatured in separate tubes by heating to 95° C., resulting in single-stranded sample DNA and single-stranded adaptors. Sample ssDNA is then admixed with an adenylation reaction mixture comprising CircLigase II, 0.1 mM ATP, 15% PEG-8000, and other buffer components. The adenylation reaction mixture comprising the sample DNA is then incubated for at least 5 minutes at 65° C. to effect highly efficient adenylation of the sample ssDNA. Meanwhile, adaptor ssDNA is admixed with a Dilution buffer comprising 5 mM MnCl₂, 15% PEG-8000, and other buffer components. The Dilution buffer comprising adaptor ssDNA is then incubated for at least 5 minutes at 65° C. Upon completion of the adenylation reaction, adenylated sample ssDNA is diluted at least 10-fold with the Dilution buffer comprising adaptor ssDNA. This results in a final ATP concentration of 0.01 mM and addition of Mn²⁺ to the reaction, which together effectively drive the ligation reaction to completion. Ligation of the single-stranded adaptors to the sample ssDNA results in creation of the ssDNA library. The adenylation and ligation reactions altogether can be completed in approximately 1.5 hours. ssDNA library members are then purified using magnetic beads (SeraMAG, Thermo Fisher).
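The dilution arithmetic in this step can be checked with a brief sketch. It assumes that "10-fold" means the final volume is ten times the adenylation volume, that the Dilution buffer contains no ATP, and that the adenylation mixture contains no Mn²⁺; the function names are hypothetical. The Mn²⁺ value is a derived estimate under those assumptions and is not stated in the protocol.

```python
def diluted_concentration(initial, fold):
    """Concentration of a solute present only in the original mixture after dilution."""
    return initial / fold


def contributed_concentration(diluent_conc, fold):
    """Final concentration of a solute present only in the diluent.

    The diluent makes up (fold - 1) / fold of the final volume.
    """
    return diluent_conc * (fold - 1) / fold


FOLD = 10  # at least 10-fold dilution, per the protocol

atp_final_mM = diluted_concentration(0.1, FOLD)     # ATP starts at 0.1 mM
mn_final_mM = contributed_concentration(5.0, FOLD)  # Dilution buffer has 5 mM MnCl2

print(f"final ATP:  {atp_final_mM:.3f} mM")
print(f"final Mn2+: {mn_final_mM:.2f} mM")
```

At exactly 10-fold dilution the ATP concentration drops to 0.01 mM as stated above, while the Mn²⁺ concentration ends up slightly below that of the Dilution buffer itself.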

Target Enrichment (Approximately 2 Hours)

Approximately 50-150 ng of ssDNA library members are incubated in separate amplification reaction mixtures comprising 0.5 μM of either TSO Set 1 primers or TSO Set 2 primers from Example 15. Separation of the TSO Set 1 primers and TSO Set 2 primers ensures that only linear amplification of target regions occurs. Amplification reaction mixtures also comprise a high-fidelity DNA Polymerase (Phusion Hot Start II, Thermo Scientific), dNTPs, and other reaction components necessary for conducting an amplification reaction. 40 cycles of amplification are performed using a thermocycler. Linear amplification results in capture and enrichment of selected target regions corresponding to exons of the 96 cancer genes in Table 9, wherein each captured target region comprises a first adaptor comprising a sample index barcode at a first end and a second adaptor comprising a strand-specific barcode at the other end. Captured targets are quantified as described herein and normalized to 1 nM (or 12×10⁶ copies/μL) for sequencing on a MiSeq sequencer (Illumina).

Example 17: Assessment of Low Tm Probe Designs

Genomic DNA was harvested from a tumor sample known to harbor a stop mutation in codon 1306 of the APC gene (c.3916G>T) as determined via sequencing. Similarly, wild-type DNA (NA18507) was obtained from Coriell. Both samples were quantified with ddPCR using RPP30. To assess the performance of various probe designs targeting the APC mutation, a series of probes were designed as depicted in Table 10, below.

TABLE 10 low Tm probe designs
Probe           Wt                                                            Mu
5′-nuclease     HEX-ACCCTGCAAATAGCAGAAATAAAAGAAAAG-IBlkFq (SEQ ID NO: 6)       FAM-ACCCTGCAAATAGCATAAATAAAAGAAAAG-IBlkFq (SEQ ID NO: 7)
Pleiades 1      MGB-AP525-TTATTTCTGCTATTTG (SEQ ID NO: 8)                      MGB-FAM-TTTATTTATGCTATT*T*G (SEQ ID NO: 9)
Pleiades 2      MGB-AP525-TTATTTCTGCTAT*T*T*G (SEQ ID NO: 10)                  MGB-FAM-TTTATTTATGCTA*TTT*GC (SEQ ID NO: 11)
Pleiades 3      MGB-AP525-TTATTTCTGCTAT*T*T*GC (SEQ ID NO: 12)                 MGB-FAM-TTTAT*T*TATGCTA*TT*T*GC (SEQ ID NO: 13)
Miniprobes 1    MGB-FAM-TTATT*TATGCT* (SEQ ID NO: 14)                          MGB-AP525-TTATTTCTGCT (SEQ ID NO: 15)
Miniprobes 2    MGB-AP525-TTATTTCTGC (SEQ ID NO: 16)                           MGB-FAM-TTATT*TATGC (SEQ ID NO: 17)
Note: * denotes that the nt before is a superbase.

Probes were incorporated into ddPCR reaction mixes as depicted in Table 11 below and formed into droplets.

TABLE 11 ddPCR reaction mix
2x Droplet PCR Supermix                   10.0 μl
Water                                     3.2 μl
DNA                                       2.0 μl
10 uM sense primer (1 uM final)           2.0 μl
10 uM antisense primer (0.2 uM final)     0.4 μl
10 uM mu probe                            1.2 μl
10 uM wt probe                            1.2 μl
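The final concentrations quoted in Table 11 follow from the usual stock × volume / total volume relationship, as the sketch below illustrates. The dictionary layout and the name `final_concentration` are assumptions made for this example; the probe concentration it reports is derived from the listed volumes and is not stated in the table.

```python
# Volumes (ul) taken from Table 11.
VOLUMES_UL = {
    "2x Droplet PCR Supermix": 10.0,
    "water": 3.2,
    "DNA": 2.0,
    "sense primer": 2.0,
    "antisense primer": 0.4,
    "mu probe": 1.2,
    "wt probe": 1.2,
}
STOCK_UM = 10.0  # all primers and probes are 10 uM stocks

TOTAL_UL = sum(VOLUMES_UL.values())


def final_concentration(stock, volume, total):
    """C_final = C_stock * V_added / V_total."""
    return stock * volume / total


for name in ("sense primer", "antisense primer", "mu probe", "wt probe"):
    conc = final_concentration(STOCK_UM, VOLUMES_UL[name], TOTAL_UL)
    print(f"{name}: {conc:.2f} uM final")
```

The listed volumes sum to a 20 μl reaction, which reproduces the 1 uM and 0.2 uM primer concentrations given in the table.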

Thermocycling protocol was as follows:

10 min @ 95° C.; 30 s @ 95° C., 1 min @ 58° C., 40 cycles; 10 min @ 98° C.; hold at 12° C.

Following thermocycling, reactions were analyzed with the QX100 reader. FIG. 34A shows the use of standard 5′-nuclease probes for the APC target. FIG. 34 B shows the use of three versions of Pleiades probes for analysis, showing poorer performance relative to the standard nuclease assays. FIG. 34 C shows the use of 2 versions of miniprobes, indicating a higher specificity obtained versus the Pleiades probes and the standard 5′-nuclease probes as indicated by the separation of the wild-type (green) and mutant (blue) clusters.

To determine if the use of miniprobes only required probes of sufficient length, a pair of probes to the RNaseP locus (RPP30) were designed as follows:

TABLE 12 RNaseP assay
Probe           Wt                                                          Mu
5′-nuclease     5-/5HEX/AAGTTACTATCAGCCCTTCCTG/3IABkFQ/-3 (SEQ ID NO: 18)    5-/56-FAM/TGATACTGTTCAGAGGTGGTGCTAG/3IABkFQ/-3 (SEQ ID NO: 19)
Miniprobes 1    5-/5HEX/TTTACTATCAGCCTT/3IABkFQ/-3 (SEQ ID NO: 20)           5-/56-FAM/TTACTGATACTGTTTT/3IABkFQ/-3 (SEQ ID NO: 21)

Probes were assessed as described above. As seen in FIG. 34 D, while the miniprobes (right panel) exhibited higher background fluorescence, likely due to poorer quenching of the 15-mer versus the shorter 11-mer of the Pleiades-based miniprobes, separation was sufficient to discern distinct clusters, allowing reproducible concentration calls relative to the standard 5′-nuclease probes.

Example 18: Allelic Discrimination Assay Using Low Tm Probes and Barcoded Primers

Primer/probe sets to assay the c.1799T>A (V600E) BRAF mutation were generated and tested. Each primer/probe set tested included the common anti-sense primer CATGAAGACCTCACAGTAAA (SEQ ID NO: 22), wild-type probe HEX-TAAGGCTCGGTT-BHQ (SEQ ID NO: 23), and mutant probe FAM-TTGGGTTCGGTC-BHQ (SEQ ID NO: 24). Various designs of wild-type and mutant sense primers were tested. All wild-type sense primers comprise the barcode sequence GGCAATAAGGCTCGGTTGGCATTGG (SEQ ID NO: 25) which corresponds to the wild-type probe sequence, and all mutant sense primers comprise the barcode sequence ACATTGGGTTCGGTCGTAACTTAGGAA (SEQ ID NO: 26) which corresponds to the mutant probe sequence.
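The correspondence between each barcode and its probe can be checked directly from the sequences given above. The sketch below reads "corresponds to" as "the probe sequence occurs within the barcode sequence", which is an interpretation made here for illustration; the helper name `probe_offset` is hypothetical.

```python
WT_BARCODE = "GGCAATAAGGCTCGGTTGGCATTGG"    # SEQ ID NO: 25
MU_BARCODE = "ACATTGGGTTCGGTCGTAACTTAGGAA"  # SEQ ID NO: 26
WT_PROBE = "TAAGGCTCGGTT"                   # SEQ ID NO: 23 (without HEX/BHQ labels)
MU_PROBE = "TTGGGTTCGGTC"                   # SEQ ID NO: 24 (without FAM/BHQ labels)


def probe_offset(barcode, probe):
    """Return the 0-based position of `probe` within `barcode`, or -1 if absent."""
    return barcode.find(probe)


for label, barcode, probe in (
    ("wild-type", WT_BARCODE, WT_PROBE),
    ("mutant", MU_BARCODE, MU_PROBE),
):
    print(f"{label} probe found at offset {probe_offset(barcode, probe)}")

# Each probe should be absent from the other allele's barcode.
print(probe_offset(MU_BARCODE, WT_PROBE), probe_offset(WT_BARCODE, MU_PROBE))
```

Each 12-nt probe is found within its own barcode and not within the other, so the fluorescent signal reports which barcoded sense primer was incorporated rather than the genomic sequence itself.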

Wild-type specific sense primers were designed such that the mutation site lies under the ultimate (0) or the penultimate (−1) base. Primers were therefore designed to either contain a deliberate mismatch 1-3 nts away from the mutation site or to not contain any additional mismatch.

The following BRAF wild-type sense primers were designed according to Table 13 below.

TABLE 13 BRAF wild-type sense primer designs
Primer design               Sequence
BRAF_1799T_(-1a:-2c)        GGCAATAAGGCTCGGTTGGCATTGGCACTCCATCGAGATTTCAC (SEQ ID NO: 27)
BRAF_1799T_(-1a:-2c>a)      GGCAATAAGGCTCGGTTGGCATTGGCACTCCATCGAGATTTaAC (SEQ ID NO: 28)
BRAF_1799T_(-1a:-2c>g)      GGCAATAAGGCTCGGTTGGCATTGGCACTCCATCGAGATTTgAC (SEQ ID NO: 29)
BRAF_1799T_(-1a:-2c>t)      GGCAATAAGGCTCGGTTGGCATTGGCACTCCATCGAGATTTtAC (SEQ ID NO: 30)
BRAF_1799T_(-1a:-3t)        GGCAATAAGGCTCGGTTGGCATTGGCACTCCATCGAGATTTCAC (SEQ ID NO: 31)
BRAF_1799T_(-1a:-3t>a)      GGCAATAAGGCTCGGTTGGCATTGGCACTCCATCGAGATTaCAC (SEQ ID NO: 32)
BRAF_1799T_(-1a:-3t>c)      GGCAATAAGGCTCGGTTGGCATTGGCACTCCATCGAGATTcCAC (SEQ ID NO: 33)
BRAF_1799T_(-1a:-3t>g)      GGCAATAAGGCTCGGTTGGCATTGGCACTCCATCGAGATTgCAC (SEQ ID NO: 34)
BRAF_1799T_(0a:-1c)         GGCAATAAGGCTCGGTTGGCATTGGCCACTCCATCGAGATTTCA (SEQ ID NO: 35)
BRAF_1799T_(0a:-1c>a)       GGCAATAAGGCTCGGTTGGCATTGGCCACTCCATCGAGATTTaA (SEQ ID NO: 36)
BRAF_1799T_(0a:-1c>g)       GGCAATAAGGCTCGGTTGGCATTGGCCACTCCATCGAGATTTgA (SEQ ID NO: 37)
BRAF_1799T_(0a:-1c>t)       GGCAATAAGGCTCGGTTGGCATTGGCCACTCCATCGAGATTTtA (SEQ ID NO: 38)
BRAF_1799T_(0a:-2t)         GGCAATAAGGCTCGGTTGGCATTGGCCACTCCATCGAGATTTCA (SEQ ID NO: 39)
BRAF_1799T_(0a:-2t>a)       GGCAATAAGGCTCGGTTGGCATTGGCCACTCCATCGAGATTaCA (SEQ ID NO: 40)
BRAF_1799T_(0a:-2t>c)       GGCAATAAGGCTCGGTTGGCATTGGCCACTCCATCGAGATTcCA (SEQ ID NO: 41)
BRAF_1799T_(0a:-2t>g)       GGCAATAAGGCTCGGTTGGCATTGGCCACTCCATCGAGATTgCA (SEQ ID NO: 42)

The following BRAF mutant sense primers were designed according to Table 14 below.

TABLE 14 mutant BRAF sense primer designs Primer design SequenceBRAF_1799T > A_(-1a > t:-2c) ACATTGGGTTCGGTC GTAACTTAGGAACACTCCATCGAGATTTCT C (SEQ ID NO: 43) BRAF_1799T > A_(-1a > t:-2c > a)ACATTGGGTTCGGTC GTAACTTAGGAACAC TCCATCGAGATTTaT C (SEQ ID NO: 44)BRAF_1799T > A_(-1a > t:-2c > g) ACATTGGGTTCGGTC GTAACTTAGGAACACTCCATCGAGATTTgT C (SEQ ID NO: 45) BRAF_1799T > A_(-1a > t:-2c > t)ACATTGGGTTCGGTC GTAACTTAGGAACAC TCCATCGAGATTTtT C (SEQ ID NO: 46)BRAF_1799T > A_(-1a > t:-3t) ACATTGGGTTCGGTC GTAACTTAGGAACACTCCATCGAGATTTCT C (SEQ ID NO: 47) BRAF_1799T > A_(-1a > t:-3t > a)ACATTGGGTTCGGTC GTAACTTAGGAACAC TCCATCGAGATTaCT C (SEQ ID NO: 48)BRAF_1799T > A_(-1a > t:-3t > c) ACATTGGGTTCGGTC GTAACTTAGGAACACTCCATCGAGATTcCT C (SEQ ID NO: 49) BRAF_1799T > A_(-1a > t:-3t > g)ACATTGGGTTCGGTC GTAACTTAGGAACAC TCCATCGAGATTgCT C (SEQ ID NO: 50)BRAF_1799T > A_(0a > t:-1c) ACATTGGGTTCGGTC GTAACTTAGGAACCACTCCATCGAGATTTC T (SEQ ID NO: 51) BRAF_1799T > A_(0a > t:-1c > a)ACATTGGGTTCGGTC GTAACTTAGGAACCA CTCCATCGAGATTTa T (SEQ ID NO: 52)BRAF_1799T > A_(0a > t:-1c > g) ACATTGGGTTCGGTC GTAACTTAGGAACCACTCCATCGAGATTTg T (SEQ ID NO: 53) BRAF_1799T > A_(0a > t:-1c > t)ACATTGGGTTCGGTC GTAACTTAGGAACCA CTCCATCGAGATTTt T (SEQ ID NO: 54)BRAF_1799T > A_(0a > t:-2t) ACATTGGGTTCGGTC GTAACTTAGGAACCACTCCATCGAGATTTC T (SEQ ID NO: 55) BRAF_1799T > A_(0a > t:-2t > a)ACATTGGGTTCGGTC GTAACTTAGGAACCA CTCCATCGAGATTaC T (SEQ ID NO: 56)BRAF_1799T > A_(0a > t:-2t > c) ACATTGGGTTCGGTC GTAACTTAGGAACCACTCCATCGAGATTcC T (SEQ ID NO: 57) BRAF_1799T > A_(0a > t:-2t > g)ACATTGGGTTCGGTC GTAACTTAGGAACCA CTCCATCGAGATTgC T (SEQ ID NO: 58)

Ability to discriminate mutant from wild-type species with these primer/probe sets was assessed by digital PCR. 20× stocks of primer/probe sets were created as follows:

TABLE 15 Primer/probe set stocks
Component                  Volume (ul)
100 uM antisense primer    5
100 uM sense primer        1
100 uM probe               2
TE buffer                  17

To prepare sample DNA, a mixture of 10% mutant (RKO-1, ATCC) in wild-type (NA18507, Coriell) genomic DNA was created (~250 and ~2500 copies/μl, respectively). Alternatively, a dilution series of mutant DNA (RKO-1) in a background of a wild-type control (purified genomic DNA from whole blood) was created.

ddPCR reactions were assembled as shown in Table 16 below.

TABLE 16 ddPCR reaction
Component                              Volume (μl)
2x droplet PCR supermix (Bio-Rad)      10
20x mutant primer/probe set            1
20x wild-type primer/probe set         1
water                                  6
Sample DNA                             2
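As a quick consistency check, the final oligonucleotide concentrations in the reaction can be worked out from the stock and reaction volumes given above. This is a minimal sketch; the variable and function names are illustrative and not part of the protocol.

```python
# Volumes (ul) from the primer/probe stock table; oligos are supplied at 100 uM.
STOCK_UL = {
    "antisense primer": 5.0,
    "sense primer": 1.0,
    "probe": 2.0,
    "TE buffer": 17.0,
}
OLIGO_SOURCE_UM = 100.0

# Volumes (ul) from the ddPCR reaction table.
REACTION_UL = {
    "2x supermix": 10.0,
    "20x mutant primer/probe set": 1.0,
    "20x wild-type primer/probe set": 1.0,
    "water": 6.0,
    "sample DNA": 2.0,
}
SET_UL = 1.0  # volume of each primer/probe set added per reaction

stock_total = sum(STOCK_UL.values())        # total stock volume, ul
reaction_total = sum(REACTION_UL.values())  # total reaction volume, ul


def final_um(oligo: str) -> float:
    """Final concentration (uM) of one oligo from one primer/probe set."""
    stock_um = OLIGO_SOURCE_UM * STOCK_UL[oligo] / stock_total
    return stock_um * SET_UL / reaction_total


for name in ("antisense primer", "sense primer", "probe"):
    print(f"{name}: {final_um(name):.2f} uM in a {reaction_total:.0f} ul reaction")
```

Adding 1 ul of stock to a 20 ul reaction is a 20-fold dilution, which agrees with the 20× label on the stocks.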

ddPCR reaction mixes were converted to droplets and cycled on a C1000 thermocycler (Bio-Rad) according to the following parameters: 10 min @ 95° C.; 1 min @ 50-60° C., 45 cycles; 5 min @ 70° C.; 4° C. hold. Thermocycled reactions were then analyzed with the QX-100 ddPCR reader with Quantasoft v1.4. Results are depicted in FIGS. 35A-35B, 36A-36B, 37A-37B, 38A-38B, 39, and 40. In FIGS. 35A-35B, 36A-36B, 37A-37B, and 38A-38B, the Y axis denotes intensity of Channel 1 fluorescence (fluorescence of mutant probe, FAM), and the X axis denotes intensity of Channel 2 fluorescence (fluorescence of wild-type probe, HEX). Gridlines are spaced 500 intensity units apart, with X and Y axis maxima of 3000 intensity units. FAM fluorescence-positive droplets are circled in black ovals, HEX fluorescence-positive droplets are circled in gray ovals, and droplets that are positive for both HEX and FAM are circled in hatched ovals. For all panel B graphs in FIGS. 35A-35B, 36A-36B, 37A-37B, and 38A-38B, dark gray data points denote concentration of mutant alleles as copies/μl, and light gray data points denote concentration of wild-type alleles as copies/μl.

FIGS. 35A-35B depict results from ddPCR assays wherein the sense primers were designed to overlay the mutation site at the ultimate (0) base, and either to contain a mismatch at the base immediately adjacent to the mutation site (−1) or to not contain a further mismatch. Primers designed to overlay the mutation site at the ultimate (0) base and to have a nucleotide mismatch adjacent to the mutation site resulted in distinguishable clusters of wild-type and mutant species, with greater separation of clusters at the lower temperatures (e.g., ~50 to ~58° C.).

FIGS. 36A-36B depict results from ddPCR assays wherein the sense primers were designed to overlay the mutation site at the ultimate (0) base and either to contain a mismatch 2 bases away from the mutation site (−2) or to not contain a further mismatch. Sense primers which overlay the mutation at the 0 base and which contain a T to C substitution at the −2 base resulted in the most highly distinguishable clusters of wild-type and mutant species, particularly at temperatures from ~50 to ~54° C. FIGS. 37A-37B and 38A-38B demonstrate that primers designed to overlay the mutation site at the penultimate (−1) base did not perform as well as primers which overlay the mutation site at the ultimate (0) base.

To determine the detection limits of the BRAF ddPCR assay, a dilution series of mutant DNA (RKO-1) in a background of a wild-type control (purified genomic DNA from whole blood) was created, with mutant DNA diluted 2-fold at every dilution. Assays consisting of a mixture of BRAF_1799T_(0a:-2t>c) and BRAF_1799T>A_(0a>t:-2c) were used to interrogate a mixture of mutant BRAF genomic DNA in a background of wild-type DNA using an annealing temperature of 54° C. FIGS. 39-40 demonstrate detection limits of the BRAF low Tm universal probes with barcoded primers. FIG. 39 depicts wild-type and mutant concentration calls for each sample. Wild-type concentration calls were about 1700 copies/μl for each sample. Mutant concentration calls for each diluted sample decreased steadily, with the lowest limit of quantitation at about 1.81 copies/μl. FIG. 40 depicts fractional abundance of mutant DNA in a wild-type background, as determined by the ddPCR assay. FIG. 40 demonstrates that the ddPCR assay can detect a 0.1% fractional abundance of the BRAF mutant DNA.
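The relationship between the concentration calls and the reported fractional abundance can be illustrated with a short calculation. This is a sketch using the approximate values quoted above (about 1700 copies/μl wild-type and about 1.81 copies/μl mutant); the function name is illustrative, and fractional abundance is taken here as mutant / (mutant + wild-type).

```python
def fractional_abundance(mutant_copies_per_ul: float, wild_type_copies_per_ul: float) -> float:
    """Fraction of all target copies that carry the mutant allele."""
    total = mutant_copies_per_ul + wild_type_copies_per_ul
    if total <= 0:
        raise ValueError("total concentration must be positive")
    return mutant_copies_per_ul / total


# Approximate concentration calls quoted in the text.
fa = fractional_abundance(1.81, 1700.0)
print(f"fractional abundance: {100 * fa:.3f}%")
```

With these inputs the result is close to the 0.1% fractional abundance described for FIG. 40.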

Example 19: CNV ddPCR Panel

Digital PCR probe/primer sets were designed to assay copy number variation of 19 cancer genes (MET, FGFR1, FGFR2, FLT3, HER3, EGFR, mTOR, CDK4, HER2, RET, HADH, ZFP3, DDR2, AURKA, VEGFA, CDK6, JAK2, BRAF, SRC). Of the 19 cancer genes, 9 are known to also harbor mutations within regions exhibiting cancer-related gene amplification (MET, FGFR2, EGFR, RET, DDR2, CDK6, JAK2, BRAF, SRC). For these 9 genes, probes were designed to overlay the mutation site, and to have greater complementarity to the wild-type allele than to the mutant allele. A probe/primer set was also included for the housekeeping gene RNaseP. The probe/primer sets for the CNV panel, and the genes they correspond to, are shown in Table 17.

TABLE 17 CNV Test Panel Gene Chromosome Forward Primer Reverse PrimerName Location (Limiting) (Excess) Probe MET chr7: AATAAATCATAAGGTCAGCTTTGCACCTGT GAATACT*ATA*G 116423356- CT*T*GCCA*GAGAC TTTGTTGTGTAC(SEQ ID NO: 61) 116423525 ATG (SEQ ID NO: 60) (SEQ ID NO: 59) FGFR1chr8: AATAAATCATAACA* GTTCA*TGTGTAAGG ACTGGA*TGTGC 38282028-CCTCGATGTGCTTTA TGTACAGTG (SEQ ID NO: 64) 38282221 GC (SEQ ID NO: 63)(SEQ ID NO: 62) FGFR2 chr10: GTGGTCGGAGGAGAC AATAAATCATAACTG TACAGTGATGC123279564- GTAGAGT GATGTGGGGCTG (SEQ ID NO: 67) 123279710(SEQ ID NO: 65) (SEQ ID NO: 66) FLT3 chr13: AATAAATCATAAGA*GGTGA*AGATATGTG CATGATATCTCG 28592599- CAACA*TAGT*T*GG A*CTTTGGATTG(SEQ ID NO: 70) 28592731 AATCAC (SEQ ID NO: 69) (SEQ ID NO: 68) HER3chr12: GAAGT*T*T*GCCAT AATAAATCATAACGG ACTCCAGCCAC 56478768- CTTCGTCATGAGCTGGCGCAGAG (SEQ ID NO: 73) 56478977 (SEQ ID NO: 71) (SEQ ID NO: 72)EGFR chr7: AATAAATCATAACA* GGTATTCT*T*T*CT Probe 1: 55259409-GCATGT*CA*AGATC CTTCCGCAC CAAACTGCTGTTGGG 55259571 ACAGAT(SEQ ID NO: 75) CTGGC (SEQ ID NO: 74) (SEQ ID NO: 76) mTOR chr1:AATAAATCATAACTG GCACAATGCAGCCAA TCACACATGTTC 11188060- CTGGACCAGGGTGTTCAAGATTCTG (SEQ ID NO: 79) 11188185 (SEQ ID NO: 77) (SEQ ID NO: 78) CDK4chr12: AATAAATCATAACCA AATAAATCATAACAG CTGAGAT*GGAG 58142966-GTGCAGTCGGTGGTA CAGCTGTGCTCCCGA (SEQ ID NO: 82) 58143102 C(SEQ ID NO: 81) (SEQ ID NO: 80) HER2 chr17: AATAAATCATAACCTAATAAATCATAAGGG TGATGGCTGG 37880950- TGTCCCCAGGAAGCA AGACATATGG*GGAG(SEQ ID NO: 85) 37881176 (SEQ ID NO: 83) C (SEQ ID NO: 84) RET chr10:CTTTA*GGGT*CGGA AATAAATCATAACGT AATGGATGGC 43617375- TTCCAGTT*GGT*GTAGA*TATG (SEQ ID NO: 88) 43617484 (SEQ ID NO: 86) A*TCA(SEQ ID NO: 87) RNaseP chr10: AGGAAGGGCTGA*TA AATAAATCATAACAGGTACCCTTGGA 92632074- GTAA*CTTAG AAGCCGGAGCTGGA (SEQ ID NO: 91) 92632223(SEQ ID NO: 89) (SEQ ID NO: 90) HADH chr4: AATAAATCATAACTCAATAAATCATAAGAT ACCAAGTCTGTG AP525 108935580- (I07)ACGATGGCTTGCAGCCTCCGTTGT (SEQ ID NO: 94) 108935749 CCAC (SEQ ID NO: 93)(SEQ ID NO: 92) ZFP3 chr17: AATAAATCATAACT GAGTTTGGAGCAGGA 
TCCAACATGTCAP525 4994800- (I07)CCA*TGGACT TGTGAAGAAG (SEQ ID NO: 97) 4995200CTCTCGA (SEQ ID NO: 96) (SEQ ID NO: 95) DDR2 chr1: AATAAATCATAATGCGGA(I07)ATCT AGAATTAGGG AP525 162745438- GTACATCGCTGGAGG (I07)AATCAGT*T*(SEQ ID NO: 100) 162745640 (SEQ ID NO: 98) T*CTTTCC (SEQ ID NO: 99)AURKA chr20: AATAAATCATAA GGGTTTA*TAAATGT TAAATTGAATA*A* AP525 54963161-(I07)TGCAT*T*T* GA*ATGA*GATTACA (SEQ ID NO: 103) 54963260 CA(I07)GACCTGTG (SEQ ID NO: 101) (SEQ ID NO: 102) VEGFA chr6: GTGGTGAAGTTCATGAATAAATCATAACCA GCTACTGCCATC FAM 43745202- GATGTCTATC CCAGGGTCTCGATTG(SEQ ID NO: 106) 43745408 (SEQ ID NO: 104) G (SEQ ID NO: 105) CDK6 chr7:CATGTCGATCAAGAC AATAAATCATAATCA TAAAGTTCCAG AP525 92403984-TTGACCACTTACTT GTGGGCACTCCAGG (SEQ ID NO: 109) 92404134 (SEQ ID NO: 107)(SEQ ID NO: 108) JAK2 chr9: CAAGCTTTCTCACAA AATAAATCATAACTT GGAGTATGTGTCAP525 5073695- GCATTTGGT A*CTCTCGTCTCCAC (SEQ ID NO: 112) 5073789(SEQ ID NO: 110) AG (SEQ ID NO: 111) BRAF chr7: GACAACTGTTCAAACAATAAATCATAAGGT ATTTCACTGTA AP525 140453074- TGATGGGAC GATT*T*T*GGTCTA(SEQ ID NO: 115) 140453195 (SEQ ID NO: 113) GCTAC (SEQ ID NO: 114) SRCchr20: CGGTTACTGCTCAAT AATAAATCATAAC AACCCGAGAG FAM 36022571- GCAGAG(I07)TGGTCTCACT (SEQ ID NO: 118) 36022750 (SEQ ID NO: 116) TTCT(I07)GCA(SEQ ID NO: 117)

A numerical analysis was performed to determine the minimum input requirements for a 20,000 partition digital PCR experiment. This analysis examined the ability to detect a 2-fold difference in concentration between a target gene and a reference gene within a tumor population for a sample with various levels of tumor burden. Results of the numerical analysis are shown in FIG. 41. The upper and lower bounds of significance ensuring a p-value of <0.0001 (z-score≧3.891) were then determined at various input concentrations. A 2-fold difference in concentration between a target gene and a reference gene with a p-value of <0.0001 can be detected in a DNA sample originating from a tissue sample having 40% tumor burden, wherein the DNA sample comprises 20 copies/μL of RNaseP, corresponding to 0.06 ng/μL DNA (FIG. 41). Similarly, a 2-fold difference in concentration between a target gene and a reference gene with a p-value of <0.0001 can be detected in a DNA sample originating from a tissue sample having 20% tumor burden, wherein the DNA sample comprises 50 copies/μL of RNaseP, corresponding to 0.15 ng/μL DNA. Since 2.2 μL of sample is introduced per 22 μL assay volume (a 10-fold dilution), it is estimated that the CNV ddPCR assay can detect a gene amplification from as little as 0.6 ng/μL of purified FPET DNA material.
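The kind of numerical analysis described above can be sketched as follows. This is an illustrative model only, not the analysis actually performed: it assumes a droplet volume of 0.85 nL (not stated in the text), assumes that only the tumor fraction of the sample carries the 2-fold amplification, and takes the score as reference minus target so that an amplification gives a positive value. All function names are hypothetical.

```python
import math

Z_CRIT = 3.891        # threshold for p < 0.0001, from the text
PARTITIONS = 20_000   # partitions per experiment, from the text
DROPLET_NL = 0.85     # ASSUMPTION: droplet volume in nL (not given in the text)


def p_negative(copies_per_ul: float, droplet_nl: float = DROPLET_NL) -> float:
    """Poisson probability that a droplet contains no copy of the target."""
    return math.exp(-copies_per_ul * droplet_nl / 1000.0)


def z_score(ref_copies_per_ul: float, tumor_burden: float, fold: float = 2.0,
            partitions: int = PARTITIONS) -> float:
    """Standard score for target vs. reference at a given tumor burden."""
    # ASSUMPTION: normal cells contribute 1x and tumor cells `fold`x of the target.
    target = ref_copies_per_ul * ((1.0 - tumor_burden) + fold * tumor_burden)
    p_ref, p_tgt = p_negative(ref_copies_per_ul), p_negative(target)
    var_ref = p_ref * (1.0 - p_ref) / partitions
    var_tgt = p_tgt * (1.0 - p_tgt) / partitions
    return (p_ref - p_tgt) / math.sqrt(var_ref + var_tgt)


def sample_ng_per_ul(reaction_ng_per_ul: float, sample_ul: float = 2.2,
                     assay_ul: float = 22.0) -> float:
    """Concentration needed in the input sample to reach a given in-reaction concentration."""
    return reaction_ng_per_ul * assay_ul / sample_ul


print(f"z at 20 copies/uL, 40% tumor burden: {z_score(20.0, 0.40):.2f}")
print(f"input needed for 0.06 ng/uL in-reaction: {sample_ng_per_ul(0.06):.2f} ng/uL")
```

The outcome of such a model depends on the assumed droplet volume and on how tumor burden is mapped to target concentration, so it should be read as a sketch of the approach rather than a reproduction of FIG. 41.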

The CNV assay assigns a target gene i as “not amplified” if the expected value of the target gene μ_(i) is the same as the expected value of the reference gene μ_(j):

H ₀:μ_(i)=μ_(j)

If the null hypothesis is not satisfied, the target gene i is assigned as “amplified”. However, as the numbers of positive and negative counts follow a binomial distribution, the criteria for acceptance can be evaluated by application of a t-test to the proportions of negative droplets p_(i,neg) and p_(j,neg) from target gene i and reference gene j, respectively, to derive a standard (z_(i)) score:

$z_{i} = \frac{p_{i,neg} - \bar{p}_{j,neg}}{\sqrt{\sigma_{i,neg}^{2} + \bar{\sigma}_{j,neg}^{2}}}$

If the standard score z_(i) is ≧3.891, then the target gene i is “amplified” at p<0.0001 (i.e., 99.99% CI).
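The amplification call can be sketched from raw droplet counts as follows. This is a minimal illustration and not the assay software: the function names are hypothetical, the standard deviation of each proportion uses the binomial form p(1 − p)/N, and, because an amplified target yields fewer negative droplets (so that p_(i,neg) − p_(j,neg) is negative), the sketch assumes the threshold is applied to the magnitude of the score.

```python
import math

Z_CRIT = 3.891  # threshold for p < 0.0001, from the text


def proportion_negative(n_neg: int, n_tot: int) -> tuple[float, float]:
    """Proportion of negative droplets and its binomial standard deviation."""
    p = n_neg / n_tot
    sigma = math.sqrt(p * (1.0 - p) / n_tot)
    return p, sigma


def standard_score(target_counts: tuple[int, int], ref_counts: tuple[int, int]) -> float:
    """z = (p_i,neg - p_j,neg) / sqrt(sigma_i^2 + sigma_j^2); counts are (n_neg, n_tot)."""
    p_i, s_i = proportion_negative(*target_counts)
    p_j, s_j = proportion_negative(*ref_counts)
    return (p_i - p_j) / math.sqrt(s_i ** 2 + s_j ** 2)


def is_amplified(target_counts: tuple[int, int], ref_counts: tuple[int, int]) -> bool:
    """ASSUMPTION: amplified means significantly FEWER negative droplets than the reference."""
    return standard_score(target_counts, ref_counts) <= -Z_CRIT


# Hypothetical droplet counts (negative, total) for illustration only.
print(is_amplified((15_000, 20_000), (17_000, 20_000)))
print(is_amplified((17_000, 20_000), (17_000, 20_000)))
```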

For the BRAF gene, the assay is designed to a region of the BRAF gene on chromosome 7 that has off-target homology to a region on the X chromosome. Thus, the total concentration of BRAF observed is the sum of the contributions of both targets:

c _(BRAF,tot) =c _(BRAF,chr7) +c _(BRAF,chrX)

c _(BRAF,tot) =m·c _(ref) +n·c _(ref)

c _(BRAF,tot)=(m+n)·c _(ref)

where m represents the fold-amplification versus the reference value, and n represents the number of copies on the X-chromosome. This can be related to the expected values in Poisson space:

$\mspace{20mu} {{{- \frac{1000}{V}}{\ln \left( p_{{neg},{BRAF}} \right)}} = {{{{- \frac{1000}{V}}\left( {m + n} \right){\ln \left( p_{{neg},{ref}} \right)}}\mspace{20mu} - {\frac{1000}{V}{\ln \left( p_{{neg},{BRAF}} \right)}}} = {{- \frac{1000}{V}}\left( {m + n} \right){\ln \left( p_{{neg},{ref}} \right)}}}}$ln (p_(neg, BRAF)) = (m + n)ln (p_(neg, ref))ln (p_(neg, BRAF)) = (m + n)ln (p_(neg, ref))  p_(neg, BFAF) = p_(neg, ref)^(m + n)p_(neg, BRAF) = p_(neg, ref)^(m + n)

For a “normal” sample, m=1. Due to the presence of a pseudogene for BRAF on the X-chromosome, n=0.5 for male, n=1 for female. Therefore, the expected “normal” value of BRAF occurs when 1+n=1.5 or 2.0.
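The relationship above can be inverted to estimate the fold-amplification m from the observed proportions of negative droplets. This is a minimal sketch; the function name and the example proportions are hypothetical, and the values of n are those given in the text.

```python
import math

# Contribution of the X-chromosome homolog, from the text.
X_COPIES = {"male": 0.5, "female": 1.0}


def braf_fold_amplification(p_neg_braf: float, p_neg_ref: float, sex: str) -> float:
    """Solve p_neg,BRAF = p_neg,ref ** (m + n) for m."""
    n = X_COPIES[sex]
    return math.log(p_neg_braf) / math.log(p_neg_ref) - n


# Hypothetical example: a male sample whose BRAF signal matches the "normal" expectation.
p_ref = 0.90
p_braf_normal_male = p_ref ** 1.5
print(round(braf_fold_amplification(p_braf_normal_male, p_ref, "male"), 6))
```

A result near 1 corresponds to a sample without BRAF amplification; larger values indicate amplification of the chromosome 7 locus under this model.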

Example 20: Use of CNV ddPCR Panel for Selecting Effective Cancer Treatment

A patient presented with metastatic colon cancer. The colon cancer had metastasized to the patient's liver. Five different types of chemotherapy treatments had been attempted without success. A liver biopsy suspected of containing cancerous tissue was obtained from the patient and fresh frozen. DNA was extracted from the liver biopsy and quantitated. Sample DNA from the patient was then subjected to ddPCR using the primer/probe sets for VEGFA, EGFR, CDK6, MET, BRAF, FGFR1, JAK2, HER3, CDK4, HER2, SRC, and AURKA, outlined in Table 17 (above). PCR thermocycler conditions were as follows: 10 minutes at 95° C. (100% ramp rate), followed by 45 cycles of (30 seconds at 95° C., 60 seconds at 60° C.), followed by 5 minutes at 70° C., followed by a 25° C. hold. Droplets were enumerated by Quantasoft. The concentrations of target and reference genes were calculated using the following equations for each gene i:

$p_{i,neg} = \frac{N_{i,neg}}{N_{i,tot}}, \quad \sigma_{i,neg} = \sqrt{\frac{N_{i,neg}}{N_{i,tot}}\left(1 - \frac{N_{i,neg}}{N_{i,tot}}\right)/N_{i,tot}}, \quad \therefore p_{i,neg,99.99\%\ CI} = p_{i,neg} \pm 3.891 \cdot \sigma_{i,neg}$

where p_(i,neg) is the proportion of negative droplets, N_(i,neg) is the number of negative events, σ_(i,neg) is the standard deviation of the proportion measurement, N_(i,tot) is the number of accepted events for each gene i as determined by QuantaSoft, and p_(i,neg,99.99% CI) is the lower and upper bound of the proportion measurement. The concentration of each species c was converted to concentration units (copies/μL) according to the following relationship:

$c = {{- \frac{1000}{V}}{\ln \left( p_{neg} \right)}}$

where V represents the volume of the partition/droplet.
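The conversion from droplet counts to a concentration with 99.99% bounds can be sketched as below. This is illustrative only: the function names and example counts are hypothetical, V is taken in nanoliters (which is what makes the factor of 1000 yield copies/μL), and the default droplet volume of 0.85 nL is an assumption not stated in the text.

```python
import math

Z_CRIT = 3.891  # multiplier for the 99.99% bounds, from the text


def concentration(p_neg: float, droplet_nl: float) -> float:
    """c = -(1000 / V) * ln(p_neg), in copies/uL when V is in nL."""
    return -(1000.0 / droplet_nl) * math.log(p_neg)


def concentration_with_bounds(n_neg: int, n_tot: int, droplet_nl: float = 0.85):
    """Return (lower, point estimate, upper) concentration in copies/uL.

    ASSUMPTION: droplet_nl defaults to 0.85 nL; substitute the instrument's value.
    """
    p = n_neg / n_tot
    sigma = math.sqrt(p * (1.0 - p) / n_tot)
    p_lo, p_hi = p - Z_CRIT * sigma, p + Z_CRIT * sigma
    if p_lo <= 0.0 or p_hi >= 1.0:
        raise ValueError("bounds fall outside (0, 1); too few positive or negative droplets")
    # A higher proportion of negative droplets corresponds to a lower concentration.
    return concentration(p_hi, droplet_nl), concentration(p, droplet_nl), concentration(p_lo, droplet_nl)


# Hypothetical counts for illustration: 18,000 negative droplets out of 20,000 accepted.
low, mid, high = concentration_with_bounds(18_000, 20_000)
print(f"{mid:.1f} copies/uL (99.99% bounds: {low:.1f} to {high:.1f})")
```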

Results from the CNV ddPCR assay are shown in FIGS. 42A-42B. FIG. 42A depicts concentration of 12 of the CNV cancer genes, and FIG. 42B depicts copy number of the 12 genes in the patient sample. A dramatic amplification of the HER2 gene was revealed by the CNV ddPCR assay. The HER2 amplification was reported to the patient's doctor. Based on the results of the CNV ddPCR assay, the doctor prescribed the breast cancer drug T-DM1. FIGS. 43A-43D depict image scans of the patient's liver taken after chemotherapy treatment regimens 1 and 2 (FIG. 43A), taken after chemotherapy treatment regimens 3-5 (FIG. 43B), taken after the patient received two doses of T-DM1 (FIG. 43C), and taken after the patient received the third dose of T-DM1 (FIG. 43D). FIG. 43A reveals two dark spots in the liver, indicative of cancerous tissue. FIG. 43B reveals that despite chemotherapy regimens 3-5, the cancerous growths increased dramatically in size. FIG. 43C reveals that after two doses of T-DM1, the cancerous growths had shrunk by at least ~50%. FIG. 43D reveals that after the third dose of T-DM1, the cancerous growths were undetectable by image scan.

Example 21: Detection of Copy Number Variation and Gene Mutation Using a Single Assay

The CNV primer/probe set for EGFR as depicted in Table 17 was used to assay both copy number variation and the presence of mutant EGFR in a cancer patient sample. The EGFR probe overlays a site known to harbor a cancer-related mutation and has a sequence corresponding to the wild-type allele. ddPCR was conducted as described herein (see, e.g., Example 20). FIG. 44A depicts results of the assay. Because of a mismatch between the EGFR probe and the mutant allele, the probe had lower binding efficiency to the mutant allele, resulting in a cluster of ddPCR droplets with distinguishably lower fluorescence intensity. FIG. 44B depicts quantitation results from the assay. The high-intensity cluster of EGFR-positive droplets was enumerated as wild-type, and the low-intensity cluster of EGFR-positive droplets was enumerated as mutant. The sample was determined to contain 267 copies/μl total EGFR (wt+mu), with an equal proportion of wt and mu EGFR. EGFR also exhibited a 2-fold gene amplification, with copy number increasing from 2 to 4.12.

Example 22: 5′-Adaptor Ligation of gDNA Fragments, TSO Hybridization and Expansion

Three uL of 5′-phosphorylated fragmented genomic DNA (10-1,000 ng) was denatured for 3 min at 95° C. and then cooled on ice. The denatured DNA was pre-adenylated by adding the denatured DNA and 40% (w/v) PEG-8000 (3.0 uL) to a mixture of 10× CircLigase II buffer (0.8 uL), 4 mM ATP (0.2 uL), CircLigase II (0.8 uL) and glycogen (0.2 uL) and incubating the resulting adenylation mixture for 5 min at 60° C. An adaptor mixture containing an adaptor (2.0 uL), 50 mM MnCl₂ (8.0 uL), water (29.6 uL), 10× CircLigase II buffer (8.0 uL), 10% Tween-20 (0.4 uL), glycogen (2.0 uL) and 40% (w/v) PEG-8000 (30.0 uL) was pre-incubated for 5 min at 60° C. and then transferred quickly to the adenylation reaction mixture with vigorous vortexing. The resulting adaptor-ligation reaction mixture was incubated for 60 min at 60° C.
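The final composition of the combined adaptor-ligation reaction follows from the volumes listed above. The sketch below simply adds the adenylation and adaptor mixtures and applies C1·V1 = C2·V2; the dictionary keys and helper name are illustrative.

```python
# Volumes (uL) of the adenylation mixture, from the text.
ADENYLATION_UL = {
    "denatured DNA": 3.0,
    "40% PEG-8000": 3.0,
    "10x CircLigase II buffer": 0.8,
    "4 mM ATP": 0.2,
    "CircLigase II": 0.8,
    "glycogen": 0.2,
}

# Volumes (uL) of the adaptor mixture, from the text.
ADAPTOR_UL = {
    "adaptor": 2.0,
    "50 mM MnCl2": 8.0,
    "water": 29.6,
    "10x CircLigase II buffer": 8.0,
    "10% Tween-20": 0.4,
    "glycogen": 2.0,
    "40% PEG-8000": 30.0,
}

total_ul = sum(ADENYLATION_UL.values()) + sum(ADAPTOR_UL.values())


def final_conc(stock_conc: float, volume_ul: float) -> float:
    """Final concentration after dilution into the combined reaction (same units as stock)."""
    return stock_conc * volume_ul / total_ul


peg_percent = final_conc(40.0, ADENYLATION_UL["40% PEG-8000"] + ADAPTOR_UL["40% PEG-8000"])
buffer_x = final_conc(10.0, ADENYLATION_UL["10x CircLigase II buffer"]
                      + ADAPTOR_UL["10x CircLigase II buffer"])
mncl2_mm = final_conc(50.0, ADAPTOR_UL["50 mM MnCl2"])
atp_um = final_conc(4000.0, ADENYLATION_UL["4 mM ATP"])

print(f"total volume: {total_ul:.1f} uL")
print(f"PEG-8000: {peg_percent:.1f}% (w/v), buffer: {buffer_x:.2f}x")
print(f"MnCl2: {mncl2_mm:.2f} mM, ATP: {atp_um:.1f} uM")
```

The buffer working out to 1× in the combined volume suggests the two mixtures were sized to be used together.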

To purify the 5′-adapted ssDNA fragments, an equal volume of SeraLIG solution (4 M NaCl, 10 mM Tris of pH 7.4, 10 mM EDTA and 0.05% (v/v) Tween-20) was added to the adaptor-ligation reaction mixture. The resulting mixture was incubated for 10 min at room temperature (RT), the sample tube was magnetized for 5 min, and the supernatant was removed. The magnetized pellet was washed twice with 70% ethanol and air-dried for 5 min at RT. DNA was eluted off the beads with 10 uL of 10 mM Tris (pH 7.4).

For hybridization of a target-selective oligonucleotide (TSO) to a target DNA fragment, a TSO hybridization mixture containing 2×GC mix (4.0 uL), a TSO set (0.5 uL), 40% PEG-8000 (1.0 uL) and 5′-adapted ssDNA fragments (2.5 uL) was incubated under the following thermocycling program: 1 min, 95° C.; −1.0° C./cycle; 35 cycles; 60° C. hold. A polymerase mixture containing 2×GC mix (1.0 uL), 10 mM dNTP mix (0.5 uL) and Phusion Hot Start polymerase (0.5 uL) was pre-incubated for 30 min at 60° C. and then added to the TSO hybridization mixture. The resulting mixture was incubated for 10 min at 60° C., followed by a 4° C. hold, and then an expansion (or extension) mixture containing 2×GC mix (20.0 uL), an expansion set (2.5 uL), 10 mM dNTP mix (1.0 uL) and water (16.5 uL) was added. The resulting expansion reaction mixture was incubated under the following thermocycling conditions: 5 sec, 98° C.; 10 sec, 65° C.; 30 sec, 72° C.; 15 cycles; 5 min, 72° C.; 4° C. hold.
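One way to read the hybridization program above (1 min at 95° C., then −1.0° C. per cycle for 35 cycles, then a 60° C. hold) is as a stepwise ramp-down in which each cycle lowers the block temperature by 1° C. The sketch below enumerates the step temperatures under that assumed interpretation; the function name is hypothetical and the dwell time per step is not modeled.

```python
def ramp_down_steps(start_c: float = 95.0, step_c: float = -1.0, cycles: int = 35) -> list[float]:
    """Temperatures visited: the starting step plus one decrement per cycle."""
    return [start_c + step_c * i for i in range(cycles + 1)]


steps = ramp_down_steps()
print(f"first step: {steps[0]:.0f} C, last step: {steps[-1]:.0f} C, steps: {len(steps)}")
```

Under this reading the ramp ends at the same temperature as the stated hold.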

Upon completion of the expansion reaction, 1.5× volume of SeraPUR solution (2 M NaCl, 18% (w/v) PEG-8000, 0.2% (w/v) SeraMag beads, 10 mM Tris of pH 8.0, 10 mM EDTA and 0.05% (v/v) Tween-20) was added to the reaction mixture. The resulting mixture was incubated for 10 min at RT, the sample tube was magnetized for 5 min, and the supernatant was removed. The magnetized pellet was washed twice with 70% ethanol and air-dried for 5 min at RT. The resulting sequencing library was eluted off the beads with 10 uL of TET (10 mM Tris of pH 8.0, 1 mM EDTA and 0.05% (v/v) Tween-20).

Example 23: 5′-End Ligation of gDNA Fragments to Solid Phase-Bound Adaptor, TSO Hybridization and Expansion

To prepare solid phase-bound 5′-adaptors, 40 uL of Streptavidin MyOne C1 Dynabeads were washed thrice with 1×BW buffer (1 M NaCl, 5 mM Tris of pH 7.4, and 0.5 mM EDTA) supplemented with 0.1 mg/mL (final) BSA and then re-suspended in 80 uL of 2×BW buffer (2 M NaCl, 10 mM Tris of pH 7.4, and 1.0 mM EDTA). One nmol of a 5′-biotinylated adaptor in 80 uL of TE buffer (10 mM Tris of pH 8.0 and 1 mM EDTA) was incubated with the re-suspended beads for 15 min at room temperature (RT). The 5′-adaptor-bound MyOne C1 beads were washed twice with 1×BW buffer followed by a single wash with NEB4 buffer (50 mM K.acetate, 20 mM Tris.acetate, 10 mM Mg.acetate and 1 mM DTT, pH=7.9) supplemented with 0.1 mg/mL (final) BSA. The adaptor-bound C1 beads were then re-suspended in an adaptor mixture containing 10×NEB4 buffer (8.0 uL), 50 mM MnCl₂ (8.0 uL), 10% Tween-20 (0.4 uL), glycogen (2.0 uL), water (31.6 uL), and 40% (w/v) PEG-8000 (30 uL).

Three uL of 5′-phosphorylated fragmented genomic DNA (10-1,000 ng) was denatured for 3 min at 95° C. and then cooled on ice. The denatured DNA was pre-adenylated by adding the denatured DNA and 40% (w/v) PEG-8000 (3.0 uL) to a mixture of 10× CircLigase II buffer (0.8 uL), 4 mM ATP (0.2 uL), CircLigase II (0.8 uL) and glycogen (0.2 uL) and incubating the resulting adenylation mixture for 5 min at 60° C. The adaptor mixture prepared above was pre-incubated for 5 min at 60° C. and then transferred quickly to the adenylation reaction mixture with vigorous vortexing. The resulting adaptor-ligation reaction mixture was incubated for 60 min at 60° C.

Upon completion of the adaptor-ligation reaction, the beads were magnetized for 5 min at RT, followed by incubation with 1×BW supplemented with 0.5% SDS and 0.05% Tween-20 for 15 min at 60° C. The beads were then washed once with 0.1×BW buffer (0.1 M NaCl, 5 mM Tris of pH 7.4, and 0.5 mM EDTA) supplemented with 0.5% SDS and 0.05% Tween-20, then twice with 0.1×BW buffer supplemented with 0.05% Tween-20, and finally once with EB-T buffer (10 mM Tris of pH 8.0 and 0.05% Tween-20).

For hybridization of a target-selective oligonucleotide (TSO) to a target DNA fragment ligated at the 5′-end to a bead-bound adaptor, the beads bound to 5′-adapted ssDNA fragments were re-suspended in a TSO hybridization mixture containing 2×GC mix (4.0 uL), a TSO set (0.5 uL), 40% PEG-8000 (1.0 uL) and nuclease-free water (2.5 uL), and the suspension was incubated under the following thermocycling program: 1 min, 95° C.; −1.0° C./cycle; 35 cycles; 60° C. hold. A polymerase mixture containing 2×GC mix (1.0 uL), 10 mM dNTP mix (0.5 uL) and Phusion Hot Start polymerase (0.5 uL) was pre-incubated for 30 min at 60° C. with mixing every 10 min, and then was added to the TSO hybridization suspension. The resulting suspension was incubated for 10 min at 60° C., followed by a 4° C. hold, and then an expansion (or extension) mixture containing 2×GC mix (20.0 uL), an expansion set (2.5 uL), 10 mM dNTP mix (1.0 uL) and water (16.5 uL) was added. The resulting expansion reaction suspension was incubated under the following thermocycling conditions: 5 sec, 98° C.; 10 sec, 65° C.; 30 sec, 72° C.; 15 cycles; 5 min, 72° C.; 4° C. hold.

Upon completion of the expansion reaction, 1.5× volume of SeraPUR solution (2 M NaCl, 18% (w/v) PEG-8000, 0.2% (w/v) SeraMag beads, 10 mM Tris of pH 8.0, 10 mM EDTA and 0.05% (v/v) Tween-20) was added to the reaction suspension. The resulting suspension was incubated for 10 min at RT, the sample tube was magnetized for 5 min, and the supernatant was removed. The magnetized pellet was washed twice with 70% ethanol and air-dried for 5 min at RT. The resulting sequencing library was eluted off the beads with 10 uL of TET (10 mM Tris of pH 8.0, 1 mM EDTA and 0.05% (v/v) Tween-20).

Example 24: 5′-Adaptor Ligation of RNA Fragments, TSO Hybridization, Reverse Transcription and Expansion

Total RNA (10-1,000 ng in a final volume of 40 uL) was fragmented by incubation at 94° C. for 8 min, followed by rapid cooling at 4° C. on a thermocycler equipped with a heated lid. The 5′-end of the RNA fragments was phosphorylated using T4 polynucleotide kinase in the presence of RNaseIn (Ambion) for 30 min at 37° C. at a final volume of 50 uL in 1×T4 RNA ligase buffer with 1 mM ATP and 5% (w/v) PEG-8000.

To purify the 5′-phosphorylated RNA fragments, an equal volume of SeraPUR RNA solution (2 M LiCl, 18% (w/v) PEG-8000, 0.2% (w/v) SeraMag beads, 10 mM Tris of pH 7.4, 10 mM EDTA and 0.05% (v/v) Tween-20) was added to the phosphorylation reaction mixture. The resulting mixture was incubated for 10 min at room temperature (RT), the sample tube was magnetized for 5 min, and the supernatant was removed. The magnetized pellet was washed twice with 75% ethanol and air-dried for 5 min at RT. RNA was eluted off the beads with 6 uL of 10 mM Tris (pH 7.4).

The 5′-phosphorylated RNA fragments were pre-adenylated by adding the RNA fragments (3.0 uL) and 40% (w/v) PEG-8000 (3.0 uL) to a mixture of 10× CircLigase II buffer (0.8 uL), 4 mM ATP (0.2 uL), CircLigase II (0.8 uL) and glycogen (0.2 uL) and incubating the resulting adenylation mixture for 5 min at 60° C. An adaptor mixture containing an RNA adaptor (2.0 uL), water (37.6 uL), 10× CircLigase II buffer (8.0 uL), 10% Tween-20 (0.4 uL), glycogen (2.0 uL) and 40% (w/v) PEG-8000 (30.0 uL) was pre-incubated for 5 min at 60° C. and then transferred quickly to the adenylation reaction mixture with vigorous vortexing. The resulting adaptor-ligation reaction mixture was incubated for 60 min at 60° C.

To purify the 5′-adapted RNA fragments, an equal volume of SeraLIG RNA solution (4 M LiCl, 10 mM Tris of pH 7.4, 10 mM EDTA and 0.05% (v/v) Tween-20) was added to the adaptor-ligation reaction mixture. The resulting mixture was incubated for 10 min at RT, the sample tube was magnetized for 5 min, and the supernatant was removed. The magnetized pellet was washed twice with 75% ethanol and air-dried for 5 min at RT. RNA was eluted off the beads with 10 uL of 10 mM Tris (pH 7.4).

Two target-selective oligonucleotides (TSOs) (0.5 uM final) were incubated with 5 uL of 5′-adapted RNA fragments for 5 min at 65° C. at a final volume of 10 uL in the presence of 1 mM dNTPs, and then the hybridization mixture was placed on ice. The RNA fragments hybridized to the TSOs were reverse-transcribed for 50 min at 50° C. following the addition of a mixture containing 10× Reverse Transcription buffer (2.0 uL), 25 mM MgCl₂ (4.0 uL), 0.1 M DTT (2.0 uL), SUPERaseIn (40 U/uL) (1.0 uL) and SuperScript® III Reverse Transcriptase (200 U/uL) (1.0 uL) to the hybridization mixture. After heat inactivation for 5 min at 85° C., RNA was degraded by adding 1 uL of RNase H to the reverse transcription reaction mixture and incubating the resulting mixture for 20 min at 37° C. Two uL of the mixture containing single-stranded cDNA molecules was added to an expansion mixture containing 2× Phusion GC mix (20.0 uL), an expansion set (2.5 uL), 10 mM dNTP mix (1.0 uL), Phusion Hot Start polymerase (0.5 uL) and water (14.0 uL). The expansion reaction mixture was incubated under the following thermocycling conditions: 5 sec, 98° C.; 10 sec, 65° C.; 30 sec, 72° C.; 15 cycles; 5 min, 72° C.; 4° C. hold.

Upon completion of the expansion reaction, 1.5× volume of SeraPUR solution (2 M NaCl, 18% (w/v) PEG-8000, 0.2% (w/v) SeraMag beads, 10 mM Tris of pH 8.0, 10 mM EDTA and 0.05% (v/v) Tween-20) was added to the reaction mixture. The resulting mixture was incubated for 10 min at RT, the sample tube was magnetized for 5 min, and the supernatant was removed. The magnetized pellet was washed twice with 70% ethanol and air-dried for 5 min at RT. cDNA corresponding to a bona fide sequencing library was eluted off the beads with 10 uL of TET (10 mM Tris of pH 8.0, 1 mM EDTA and 0.05% (v/v) Tween-20).

Alternatively, both reverse transcription and expansion can be performed using Tth DNA polymerase. Tth DNA polymerase has reverse transcriptase activity in the presence of Mn²⁺ ions, allowing PCR amplification from RNA targets. To this end, 5′-adapted RNA fragments (5 uL) hybridized to either of two TSOs (1.5 uL) were reverse-transcribed by incubation for 30 min at 65° C. in a mixture containing Tth DNA polymerase (0.8 uL), 10 mM dNTPs (0.4 uL), 9 mM MnCl₂ (2.0 uL), 0.1 M DTT (2.0 uL), 10× Reverse Transcription buffer (2.0 uL) and nuclease-free water (12.1 uL). Then an expansion mixture containing 10×PCR buffer (8.0 uL), an expansion set (4.0 uL), 7.5 mM EGTA (10.0 uL) and water (14.0 uL) was added to the reverse transcription reaction mixture at RT. The expansion reaction mixture was incubated under the following thermocycling conditions: 60 sec, 94° C.; 30 sec, 94° C.; 30 sec, 65° C.; 45 sec, 72° C.; 15 cycles; 7 min, 72° C.; 4° C. hold.

Upon completion of the expansion reaction, 1.5× volume of SeraPUR solution (2 M NaCl, 18% (w/v) PEG-8000, 0.2% (w/v) SeraMag beads, 10 mM Tris of pH 8.0, 10 mM EDTA and 0.05% (v/v) Tween-20) was added to the reaction mixture. The resulting mixture was incubated for 10 min at RT, the sample tube was magnetized for 5 min, and the supernatant was removed. The magnetized pellet was washed twice with 70% ethanol and air-dried for 5 min at RT. cDNA corresponding to a bona fide sequencing library was eluted off the beads with 10 uL of TET (10 mM Tris of pH 8.0, 1 mM EDTA and 0.05% (v/v) Tween-20).

Example 25: Ligation of Genomic DNA Fragments to 3′-Adaptors

200 pmol of 3′-adaptors, synthesized with a 5′-terminal phosphate group and a 3′-end blocking group (biotin-TEG), were pre-adenylated by adding in order the components shown in Table A and incubating for 5 min at 60° C.

TABLE A Adenylation mixture (DNA sample)
Reagents                     Volume (μL)
10x CircLigase II buffer     0.8
4 mM ATP                     0.2
CircLigase II                0.8
Glycogen                     0.2
Nuclease-free water          1.0
adaptor (100 μM)             2.0
40% (w/v) PEG-8000           3.0
TOTAL                        8.0

Fragmented and repaired genomic DNA (10-1000 ng), following heat denaturation for 3 min at 95° C. and rapid cooling on ice, was then assembled in order into the DNA mixture shown in Table B, and kept on ice.

TABLE B Adaptor mixture (DNA sample)
Reagents                     Volume (μL)
Fragmented DNA               30.0
Nuclease-free water          1.6
50 mM MnCl₂                  8.0
10x NEB4 buffer              8.0
10% Tween 20                 0.4
Glycogen                     2.0
40% (w/v) PEG-8000           30.0
TOTAL                        80.0

Following pre-incubation of the DNA mixture for 5 min at 60° C., the entire contents were then transferred quickly to the Adenylation reaction mixture with vigorous vortexing, and incubated for 60 min at 60° C.

Samples were then purified by adding an equal volume of SeraLIG solution (4 M NaCl, 10 mM Tris pH=7.4, 10 mM EDTA and 0.05% (v/v) Tween-20). Following incubation for 10 min at room temperature, sample tubes were magnetized for 5 min, and the supernatant removed. The magnetized pellet was then washed twice with 70% ethanol and air-dried for 5 min at room temperature. DNA was then eluted off the beads with 40 μL of 10 mM Tris pH=7.4.

5′-ends were then phosphorylated with 50 U of T4 polynucleotide kinase in the presence of RNaseIn (Ambion) for 30 min at 37° C. at a final volume of 50 μL in 1×T4 RNA ligase buffer with 1 mM ATP and 5% (w/v) PEG-8000.

Samples were then purified by adding an equal volume of SeraPUR solution (2 M NaCl, 18% (w/v) PEG-8000, 0.2% (w/v) SeraMag beads, 10 mM Tris pH=7.4, 10 mM EDTA and 0.05% (v/v) Tween-20). Following incubation for 10 min at room temperature, sample tubes were magnetized for 5 min, and the supernatant removed. The magnetized pellet was then washed twice with 70% ethanol and air-dried for 5 min at room temperature. DNA was then eluted off the beads with 6 μL of 10 mM Tris pH=8.0.

Example 26: Ligation of 3′-Adapted Genomic DNA Fragments to 5′-Adaptors

The Adaptor mixture shown in Table C was assembled in order, and kept on ice.

TABLE C Adaptor mixture (DNA sample)
Reagents                     Volume (μL)
adaptor                      2.0
50 mM MnCl₂                  8.0
Water                        29.6
10x CircLigase II buffer     8.0
10% Tween 20                 0.4
Glycogen                     2.0
40% (w/v) PEG-8000           30.0
TOTAL                        80.0

3 μL of 5′-phosphorylated fragmented DNA (5 ng to 500 ng) was then denatured for 3 min at 95° C., and cooled on ice.

Denatured DNA was then pre-adenylated by adding in order the components shown in Table D and incubated for 5 min at 60° C.:

TABLE D Adenylation mixture (DNA sample)

Reagents                    Volume (μL)
10x CircLigase II buffer    0.8
4 mM ATP                    0.2
CircLigase II               0.8
Glycogen                    0.2
Denatured DNA               3.0
40% (w/v) PEG-8000          3.0
TOTAL                       8.0

Following pre-incubation of the Adaptor mixture for 5 min at 60° C., the entire contents were then transferred quickly to the Adenylation reaction mixture with vigorous vortexing, and incubated for 60 min at 60° C.

Samples were then purified by adding an equal volume of SeraLIG solution (4 M NaCl, 10 mM Tris pH=7.4, 10 mM EDTA and 0.05% (v/v) Tween-20). Following incubation for 10 min at room temperature, sample tubes were magnetized for 5 min, and the supernatant removed. The magnetized pellet was then washed twice with 70% ethanol and air-dried for 5 min at room temperature. DNA was then eluted off the beads with 10 μL of 10 mM Tris pH=7.4.

Example 27: Ligation of 5′-Adapted Genomic DNA Fragments to 3′-Adaptors

200 pmol of 3′-adaptors, synthesized with a 5′-terminal phosphate group and a 3′-end blocking group (biotin-TEG), were pre-adenylated by adding in order the components shown in Table E and incubated for 5 min at 60° C.

TABLE E Adenylation mixture (DNA sample)

Reagents                    Volume (μL)
10x CircLigase II buffer    0.8
4 mM ATP                    0.2
CircLigase II               0.8
Glycogen                    0.2
Nuclease-free water         1.0
adaptor (100 μM)            2.0
40% (w/v) PEG-8000          3.0
TOTAL                       8.0
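The volumes in Table E can be checked against the stated 200 pmol of adaptor with a short calculation. The following is a minimal sketch only; the dictionary and function names are illustrative and are not part of the disclosed method.

```python
# Component volumes (in microliters) transcribed from Table E.
TABLE_E_UL = {
    "10x CircLigase II buffer": 0.8,
    "4 mM ATP": 0.2,
    "CircLigase II": 0.8,
    "Glycogen": 0.2,
    "Nuclease-free water": 1.0,
    "adaptor (100 uM)": 2.0,
    "40% (w/v) PEG-8000": 3.0,
}


def picomoles(volume_ul, conc_um):
    """Return the amount in pmol; 1 uM equals 1 pmol per microliter."""
    return volume_ul * conc_um


# Sum of all component volumes, rounded to suppress floating-point noise.
total_ul = round(sum(TABLE_E_UL.values()), 3)

# Adaptor amount delivered by 2.0 uL of a 100 uM stock.
adaptor_pmol = picomoles(TABLE_E_UL["adaptor (100 uM)"], 100.0)

print(f"total volume: {total_ul} uL, adaptor: {adaptor_pmol} pmol")
```

The component volumes sum to the listed total, and 2.0 μL of a 100 μM adaptor stock corresponds to the 200 pmol stated above.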

5′-adapted genomic DNA (10-1000 ng), following heat denaturation for 3 min at 95° C. and rapid cooling on ice, was then assembled in order into the DNA mixture shown in Table F, and kept on ice.

TABLE F Adaptor mixture (DNA sample)

Reagents                    Volume (μL)
Fragmented DNA              30.0
Nuclease-free water         1.6
50 mM MnCl₂                 8.0
10x NEB4 buffer             8.0
10% Tween 20                0.4
Glycogen                    2.0
40% (w/v) PEG-8000          30.0
TOTAL                       80.0

Following pre-incubation of the DNA mixture for 5 min at 60° C., the entire contents were then transferred quickly to the Adenylation reaction mixture with vigorous vortexing, and incubated for 60 min at 60° C.

Samples were then purified by adding an equal volume of SeraLIG solution (4 M NaCl, 10 mM Tris pH=7.4, 10 mM EDTA and 0.05% (v/v) Tween-20). Following incubation for 10 min at room temperature, sample tubes were magnetized for 5 min, and the supernatant removed. The magnetized pellet was then washed twice with 70% ethanol and air-dried for 5 min at room temperature. DNA was then eluted off the beads with 40 μL of 10 mM Tris pH=7.4.
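Because the entire DNA mixture (Table F, 80 μL) is transferred into the Adenylation mixture (Table E, 8 μL), approximate final concentrations in the combined ligation reaction can be estimated by simple dilution. The sketch below assumes that volumes are additive, that nothing is lost to evaporation during pre-incubation, and that no ATP is consumed before mixing. These assumptions and all names are illustrative and are not stated in the disclosure.

```python
DNA_MIX_UL = 80.0          # Table F total
ADENYLATION_MIX_UL = 8.0   # Table E total
COMBINED_UL = DNA_MIX_UL + ADENYLATION_MIX_UL  # assumes additive volumes


def final_conc(stock_conc, stock_ul, total_ul=COMBINED_UL):
    """Dilution arithmetic: C_final = C_stock * V_stock / V_total."""
    return stock_conc * stock_ul / total_ul


# 50 mM MnCl2, 8.0 uL, contributed by the DNA mixture only.
mncl2_mm = final_conc(50.0, 8.0)

# 40% (w/v) PEG-8000: 30.0 uL from the DNA mixture plus 3.0 uL from the
# Adenylation mixture.
peg_percent = final_conc(40.0, 30.0 + 3.0)

# 4 mM ATP expressed as 4000 uM, 0.2 uL; an upper bound, because some ATP
# is expected to be consumed during pre-adenylation.
atp_um = final_conc(4000.0, 0.2)

# 100 uM adaptor, 2.0 uL.
adaptor_um = final_conc(100.0, 2.0)

print(f"MnCl2 {mncl2_mm:.2f} mM, PEG-8000 {peg_percent:.1f}% (w/v), "
      f"ATP <= {atp_um:.1f} uM, adaptor {adaptor_um:.2f} uM")
```

Under these assumptions the combined 88 μL reaction contains roughly 4.5 mM MnCl₂, 15% (w/v) PEG-8000, and about 2.3 μM adaptor.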

Example 28: Case Study 1

A mid-forties individual presents with metastatic lung cancer (lung adenocarcinoma). A fresh core needle biopsy is taken from the lung (right, lower lobe nodule). The biopsy is placed in a storage solution and shipped for analysis at ambient temperature. Purified DNA is sheared to 600 bp and a library is generated and sequenced using methods described herein. A subset of copy number alterations are orthogonally measured using ddPCR. FIG. 58A illustrates an alteration identified in ERCC6 that results in a Q1431R change and an alteration in AURKA that results in an F31I change. FIG. 58B illustrates a box whisker plot showing the distribution of gene ratios observed in the sample across 96 genes. The box portion of the plot indicates one standard deviation, and whiskers show two standard deviations. Points outside the box whisker plot are outliers from the observed distribution of ratios. The right plot shows individual ratios with the corresponding log-normal distribution curve. FIG. 58C illustrates a comparison between ratio values called across 12 genes with a library formation and DNA sequencing technique provided herein versus ddPCR.
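The box whisker summary described for FIG. 58B (box at one standard deviation, whiskers at two, outliers beyond the whiskers) can be sketched as follows. This is an illustration only: it assumes the ratios are summarized on a log2 scale, which is consistent with the log-normal curve mentioned above but is not stated explicitly, and the gene names and ratio values are hypothetical placeholders rather than data from the case study.

```python
import math
import statistics


def summarize_gene_ratios(ratios):
    """Summarize per-gene ratios on a log2 scale.

    Returns the mean, the box (mean +/- 1 SD), the whiskers (mean +/- 2 SD),
    and the genes whose log2 ratio falls outside the whiskers.
    """
    logs = {gene: math.log2(ratio) for gene, ratio in ratios.items()}
    values = list(logs.values())
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation
    whisker_low, whisker_high = mean - 2 * sd, mean + 2 * sd
    outliers = sorted(
        gene for gene, value in logs.items()
        if value < whisker_low or value > whisker_high
    )
    return {
        "mean": mean,
        "box": (mean - sd, mean + sd),
        "whiskers": (whisker_low, whisker_high),
        "outliers": outliers,
    }


# Hypothetical ratios: 15 genes near 1.0 and one gene at 4.0.
example_ratios = {
    "GENE%02d" % (i + 1): (0.9, 1.0, 1.1)[i % 3] for i in range(15)
}
example_ratios["GENE16"] = 4.0

summary = summarize_gene_ratios(example_ratios)
print(summary["outliers"])
```

In this hypothetical panel only the gene with a ratio of 4.0 falls outside the two standard deviation whiskers. A real analysis might prefer a robust estimate of spread, since a strongly amplified gene inflates the standard deviation it is compared against.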

Example 29: Case Study 2

A mid-forties individual presents with esophageal cancer (esophageal adenocarcinoma). A volume of 10 mL of blood is collected in Streck tubes, and 4 mL of plasma is recovered. Cell-free DNA (14 ng) is used to generate a library for sequencing using methods described herein. FIG. 59A illustrates a box whisker plot showing the distribution of gene ratios observed in the sample across 96 genes. The box portion of the plot indicates one standard deviation, and whiskers show two standard deviations. Points outside the box whisker plot are outliers from the observed distribution of ratios. FIG. 59B illustrates the results of an interrogation of the TCGA dataset (www.cbioportal.com) for the prevalence of CCND1 amplification, which reveals the highest incidence of CCND1 amplifications in esophageal cancer.

Example 30: Reference Materials for Circulating DNA

In order to generate a reference material suitable for validation and benchmarking of cfDNA detection and sequencing assays, DNA is extracted from cell lines from reference germline genomes, e.g., the Ashkenazi father and son from the NIST Genome-in-a-Bottle Consortium. These DNA samples are mixed in several dilutions that approximate the mixtures from tumor DNA in the background of germline ‘normal’ DNA in a cancer patient, or mother/fetus DNA mixtures present at different times in pregnancy. The better known sample (e.g., the son of the Ashkenazi trio) is typically diluted down to a proportion of 0.5-1%. Such proportions can also be 0.01%-0.1%, 0.01%-0.2%, 0.01%-0.3%, 0.01%-0.4%, 0.01%-0.5%, 0.5-1%, 1%-1.5%, 1.5%-2%, 2%-3%, 3-4%, or 4%-5%. These proportions can be 1-5 haploid copies, 1-10 haploid copies, 5-10 haploid copies, 10-20 haploid copies, 10-50 haploid copies, 20-50 haploid copies, 30-50 haploid copies, 40-50 haploid copies, 50-75 haploid copies, or 75-100 haploid copies of the rarer genome in the mixture.
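The relationship between a mixing proportion and the number of haploid copies of the rarer genome can be sketched as a small planning calculation. The sketch assumes a mass of approximately 3.3 pg per haploid human genome, and the stock concentrations, total mass, and function name are hypothetical values chosen for illustration.

```python
# Assumption: approximate mass of one haploid human genome, in picograms.
HAPLOID_GENOME_PG = 3.3


def mixture_plan(total_ng, rare_fraction, rare_stock_ng_per_ul,
                 background_stock_ng_per_ul):
    """Volumes of each DNA stock needed for a mixture by mass fraction.

    rare_fraction is the mass fraction of the rarer genome (0.01 for 1%).
    Also reports the expected haploid copies of the rarer genome.
    """
    rare_ng = total_ng * rare_fraction
    background_ng = total_ng - rare_ng
    return {
        "rare_ul": rare_ng / rare_stock_ng_per_ul,
        "background_ul": background_ng / background_stock_ng_per_ul,
        "rare_haploid_copies": rare_ng * 1000.0 / HAPLOID_GENOME_PG,
    }


# Hypothetical example: 100 ng total at a 1% proportion, using a 1 ng/uL
# stock of the rarer genome and a 10 ng/uL stock of the background genome.
plan = mixture_plan(total_ng=100.0, rare_fraction=0.01,
                    rare_stock_ng_per_ul=1.0,
                    background_stock_ng_per_ul=10.0)
print(plan)
```

Under these assumptions, a 100 ng mixture at 1% contains about 300 haploid copies of the rarer genome, so reaching the 1-100 copy ranges listed above requires a lower proportion, a smaller aliquot, or both.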

In some cases the DNA is extracted as roughly intact chromatin (e.g., without protein removal to remove histones). In other cases, the DNA is extracted and chromatin is reconstructed in vitro by incubation of the DNA with purified histones and chromatin assembly factors. For example, the Active Motif Chromatin Assembly kit can be used. The reassembled chromatin can then be treated with a DNase, such as DNase I, similar to protocols performed for hypersensitivity footprinting, to create degradation patterns similar to those found in cell-free DNA. The batch of partially degraded DNA is diluted to create different reference stocks with aliquots containing a minimum of 10-50 haploid copies of the rarer genome in the mixture. In some cases, the reassembled chromatin is sheared using a nebulizer.
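The minimum total DNA mass an aliquot needs in order to contain a given number of haploid copies of the rarer genome follows from the mixing proportion. The sketch below again assumes approximately 3.3 pg per haploid human genome and treats the proportion as a mass fraction. The function name and example values are illustrative only.

```python
def min_aliquot_ng(min_rare_copies, rare_fraction, haploid_genome_pg=3.3):
    """Total DNA mass (ng) whose expected content of the rarer genome
    equals min_rare_copies haploid copies.

    haploid_genome_pg is an assumed approximate value, not a measured one.
    """
    rare_pg = min_rare_copies * haploid_genome_pg
    return rare_pg / rare_fraction / 1000.0


# Expected-value examples; actual copy counts per aliquot will vary with
# sampling, so these are lower bounds on mass rather than guarantees.
ng_for_50_copies_at_1pct = min_aliquot_ng(50, 0.01)
ng_for_10_copies_at_half_pct = min_aliquot_ng(10, 0.005)
print(ng_for_50_copies_at_1pct, ng_for_10_copies_at_half_pct)
```

Under these assumptions, about 16.5 ng of total DNA at a 1% proportion carries an expected 50 haploid copies of the rarer genome, and about 6.6 ng at 0.5% carries an expected 10 copies.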

In other cases, a reference material can be generated using FFPE reference materials and combined in different proportions. Such proportions can also be 0.01%-0.1%, 0.01%-0.2%, 0.01%-0.3%, 0.01%-0.4%, 0.01%-0.5%, 0.5-1%, 1%-1.5%, 1.5%-2%, 2%-3%, 3-4%, or 4%-5%. These proportions can be 1-5 haploid copies, 1-10 haploid copies, 5-10 haploid copies, 10-20 haploid copies, 10-50 haploid copies, 20-50 haploid copies, 30-50 haploid copies, 40-50 haploid copies, 50-75 haploid copies, or 75-100 haploid copies of the rarer genome in the mixture.

In other cases, reference materials can be plasma from volunteers. Cell-free DNA can be extracted from the volunteers and combined in different proportions. Such proportions can also be 0.01%-0.1%, 0.01%-0.2%, 0.01%-0.3%, 0.01%-0.4%, 0.01%-0.5%, 0.5-1%, 1%-1.5%, 1.5%-2%, 2%-3%, 3-4%, or 4%-5%. These proportions can be 1-5 haploid copies, 1-10 haploid copies, 5-10 haploid copies, 10-20 haploid copies, 10-50 haploid copies, 20-50 haploid copies, 30-50 haploid copies, 40-50 haploid copies, 50-75 haploid copies, or 75-100 haploid copies of the rarer genome in the mixture.

1. (canceled)
 2. (canceled)
 3. (canceled)
 4. (canceled)
 5. (canceled)
 6. (canceled)
 7. (canceled)
 8. (canceled)
 9. (canceled)
 10. (canceled)
 11. A method for nucleic acid library formation, said method comprising a. ligating a first single-stranded adaptor to a 5′ end of a single-stranded nucleic acid fragment; b. ligating a second single-stranded adaptor to a 3′ end of said single-stranded nucleic acid fragment, thereby generating a single-stranded nucleic acid fragment comprising a 5′ first single-stranded adaptor and a 3′ second single-stranded adaptor following step a) and step b); and c. extending a primer annealed to the second single-stranded adaptor to generate an extension product; d. performing polymerase chain reaction to amplify the extension product, thereby generating amplified extension product; and e. sequencing said amplified extension product.
 12. The method of claim 11, wherein said ligating of step a) occurs before said ligating of step b), wherein said ligating of step a) occurs in a reaction mixture that lacks said second single-stranded adaptor.
 13. The method of claim 11, wherein said ligating of step b) occurs before said ligating of step a), and wherein said ligating of step b) occurs in a reaction mixture that lacks said first single-stranded adaptor.
 14. The method of claim 11, further comprising pre-adenylating said second single-stranded adaptor before step b).
 15. The method of claim 11, further comprising phosphorylating a 5′ end of said single-stranded nucleic acid fragment before step a).
 16. The method of claim 11, further comprising pre-adenylating said single-stranded nucleic acid fragment before step a).
 17. The method of claim 11, further comprising performing a purification step to remove unligated first single-stranded adaptor after step a).
 18. The method of claim 11, further comprising performing a purification step to remove unligated second single-stranded adaptor after step b).
 19. (canceled)
 20. (canceled)
 21. (canceled)
 22. (canceled)
 23. (canceled)
 24. (canceled)
 25. (canceled)
 26. (canceled)
 27. (canceled)
 28. A method of generating a nucleic acid library, said method comprising a. annealing a primer comprising a 5′ phosphate to an RNA molecule; b. extending said primer to generate a first cDNA strand; c. ligating a first single-stranded adaptor to a 5′ end of said first cDNA strand, thereby generating a first cDNA strand ligated to a first single-stranded adaptor; d. annealing a target-specific oligonucleotide probe to a target sequence in said first cDNA strand ligated to a first single-stranded adaptor, wherein said target-specific oligonucleotide probe comprises a 3′ end that anneals to said target sequence and a 5′ end comprising a second adaptor; e. extending said annealed target-specific oligonucleotide probe, thereby generating an extension product; and f. amplifying said extension product using a first primer comprising sequence of said first single-stranded adaptor and a second primer comprising sequence of said second adaptor.
 29. The method of claim 28, wherein said RNA comprises mRNA.
 30. The method of claim 28, wherein said primer comprises a random primer.
 31. The method of claim 30, wherein said random primer comprises a random hexamer sequence.
 32. The method of claim 28, wherein said target sequence comprises a gene sequence.
 33. The method of claim 28, wherein said first single-stranded adaptor and said second adaptor are different.
 34. The method of claim 28, wherein said RNA molecule comprises a junction between two genes resulting from a gene fusion.
 35. The method of claim 34, wherein said gene fusion is associated with cancer.
 36. (canceled)
 37. (canceled)
 38. (canceled)
 39. (canceled)
 40. (canceled)
 41. (canceled)
 42. (canceled)
 43. (canceled)
 44. (canceled)
 45. (canceled)
 46. (canceled)
 47. (canceled)
 48. (canceled)
 49. The method of claim 28, wherein said target specific oligonucleotide comprises a sequence complementary to a region of a cancer-related gene or mRNA.
 50. The method of claim 28, wherein said primer comprises a sequence complementary to a gene region associated with cancer.
 51. The method of claim 28, wherein said RNA comprises mRNA.
 52. The method of claim 28, further comprising performing massively parallel sequencing of said amplified extension product.